comp565_fall2023_A1.pdf

School: McGill University
Course: 565
Subject: Computer Science
Date: Dec 6, 2023
Type: pdf
Pages: 4
Uploaded by stephenlu2002 on coursehero.com

Assignment 1
COMP 565 ML in Genomics and Healthcare

This assignment is worth 8% of your total grade and is due at midnight on September 25, 2023.

Question 1 [2%] Implementing LD score regression

For a phenotype of interest, we have collected the marginal statistics β̃ for M = 4268 SNPs and the M × M LD matrix R (i.e., pairwise SNP-SNP Pearson correlation). The marginal statistics are based on N = 1000 individuals. Download the marginal statistics and LD matrix from here:
https://drive.google.com/drive/folders/1tq4bTdbsv1iwO4wHxq1smzoN9D5luapp?usp=sharing

For this question, you may assume there is no population stratification in this dataset. Both phenotype and genotype were standardized.

Implement the very basic LD score regression algorithm in a programming language of your choice (preferably Python or R) to estimate the heritability of the phenotype. What is your estimate of the heritability?

Submit your answer to this question as an iPython notebook named COMP565 A1 ldsr.ipynb or an R Markdown file named COMP565 A1 ldsr.Rmd on MyCourses, so that the TA can run your code to validate its output. Do not submit the data provided to you; just make sure your code gives a clear path to the data it reads.

Question 2 [6%] Bayesian fine-mapping

For a phenotype of interest, we have identified a GWAS locus based on N = 498 individuals, which harbours 100 SNPs. As shown in Figure 1, because of the extensive LD, identifying the causal SNPs based on the p-values of the z-scores alone is error-prone. Because this is an assignment, I have highlighted the causal SNPs, namely rs10104559, rs1365732, and rs12676370, but of course in real-world applications we will not know them.

Figure 1: Manhattan plot for the GWAS locus to fine-map. The causal SNPs are coloured in red, although in practice we will not know which SNPs are causal.

Download the marginal z-scores and LD matrix from here:
https://drive.google.com/drive/folders/1tr7BCceyIcKxiO_i6iCNjvk44HHpImgG?usp=sharing

Your task is to implement a simplified version of the FINEMAP algorithm discussed in Lecture 5. To make the task easier, you may assume there are at most 3 causal SNPs in the locus. You can divide the work into four small tasks:

Your preview ends here
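For Question 1, the basic LD score regression idea can be sketched as follows. This is a minimal illustration, not the course's reference solution. It assumes standardized genotype and phenotype, so that the chi-square statistic of SNP j is N·β̃_j², and it uses the standard expectation E[χ²_j] = (N·h²/M)·ℓ_j + 1, where the LD score ℓ_j is taken as the plain row sum of squared correlations in R (no finite-sample bias correction). The function names `ld_scores` and `ldsc_heritability` are hypothetical, and loading the actual data files is left out because their format is not described here.

```python
import numpy as np


def ld_scores(R):
    """LD score of each SNP: sum of squared correlations with all SNPs."""
    R = np.asarray(R, dtype=float)
    return np.sum(R ** 2, axis=1)


def ldsc_heritability(beta_marginal, R, n_individuals):
    """Estimate h2 by regressing chi-square statistics on LD scores.

    Assumes standardized data, so chi2_j = N * beta_j**2, and fits
    chi2_j = intercept + slope * l_j by ordinary least squares.
    Under the model slope = N * h2 / M, hence h2 = slope * M / N.
    """
    beta_marginal = np.asarray(beta_marginal, dtype=float)
    m_snps = beta_marginal.shape[0]
    chi2 = n_individuals * beta_marginal ** 2
    ell = ld_scores(R)
    design = np.column_stack([np.ones(m_snps), ell])
    (intercept, slope), *_ = np.linalg.lstsq(design, chi2, rcond=None)
    h2 = slope * m_snps / n_individuals
    return h2, intercept
```

With no population stratification the fitted intercept should be close to 1; an alternative under that assumption is to fix the intercept at 1 and regress χ² − 1 on the LD scores through the origin.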
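For Question 2, one way to approach a simplified FINEMAP-style analysis is exhaustive enumeration of causal configurations, sketched below. The four sub-tasks of the assignment are not visible in this preview, so this is not the prescribed decomposition. The sketch assumes the usual summary-statistic model in which, for a configuration C, the z-scores of the causal SNPs follow N(0, R_CC + s·R_CC·R_CC) with s = N·prior_sd², compared against N(0, R_CC) under the null, and a prior in which each SNP is causal independently with probability 1/M. The function names, the `prior_sd` default of 0.05, and the prior itself are illustrative assumptions.

```python
import itertools

import numpy as np


def log_mvn0(z, cov):
    """Log density of a zero-mean multivariate normal N(0, cov) at z."""
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (len(z) * np.log(2.0 * np.pi) + logdet + quad)


def finemap_enumerate(z, R, n_individuals, max_causal=3, prior_sd=0.05):
    """Enumerate all configurations with up to max_causal causal SNPs.

    Returns the configurations, their posterior probabilities, and the
    posterior inclusion probability (PIP) of every SNP.
    """
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    m = len(z)
    effect_var = n_individuals * prior_sd ** 2  # assumed prior scale
    configs, log_post = [], []
    for k in range(max_causal + 1):
        # Prior: each SNP causal independently with probability 1/m.
        log_prior = k * np.log(1.0 / m) + (m - k) * np.log(1.0 - 1.0 / m)
        for config in itertools.combinations(range(m), k):
            if k == 0:
                log_bf = 0.0  # null configuration has Bayes factor 1
            else:
                idx = list(config)
                zc, rcc = z[idx], R[np.ix_(idx, idx)]
                log_bf = (log_mvn0(zc, rcc + effect_var * rcc @ rcc)
                          - log_mvn0(zc, rcc))
            configs.append(config)
            log_post.append(log_bf + log_prior)
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # stabilise before normalising
    post /= post.sum()
    pip = np.zeros(m)
    for config, p in zip(configs, post):
        pip[list(config)] += p
    return configs, post, pip
```

With 100 SNPs and at most 3 causal SNPs there are 1 + 100 + 4950 + 161700 = 166751 configurations, so plain enumeration is feasible, if slow in pure Python. SNP pairs in perfect LD make R_CC singular and would need special handling, for example a small ridge term on the diagonal.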



