Descriptive Statistics

  • Major assignment can be done in groups of up to 3 students.
  • Use Excel to do the major assignment. Label the answers and submit them with your name(s) and student ID(s).

 

 


Student Name(s):  ________________________________

 

 

Student Number(s):  ______________________________

 

 

 

 

 

 

 

Jennifer Nguyen, a Humber College Healthcare Management program graduate who always earned perfect marks in statistics, was hired by the famous Healthy Life medical insurance company. Jennifer is assigned to conduct statistical analysis of medical and financial data. As Jennifer is on probation, please help her to complete the following six tasks. In problems 2-6, state hypotheses H0 and H1 and provide detailed conclusions (based on P-values or critical values/test statistics) together with the Excel output. For your convenience, the data are given in the Major Assignment Data file. You can also find useful information on Blackboard in the Excel Instructions folder.

 

 

  1. Jennifer’s manager Dr. Jonathan Steinberg, who has degrees and publications in both mathematical statistics and medical science, asked her to find estimates of the average dental claim reimbursement for 2019. As Healthy Life has many thousands of clients, it is virtually impossible to calculate the population mean. Using the Excel Random Number Generator function, Jennifer selected a random sample of 54 dental claims submitted to Healthy Life. The amounts covered by insurance are listed in the Major Assignment Data file. Please help Jennifer Nguyen to construct 90%, 95%, and 99% confidence intervals for the true average reimbursement. Make sure that the t-distribution is applicable: build a histogram with the bin values, for example, $100, $200, $300, $400, and $500, and check whether it is approximately symmetric and bell-shaped. Then, use the Descriptive Statistics function from Data Analysis. When constructing the confidence intervals, please round values to two decimal places.

(Use Data Analysis → Histogram and Data Analysis → Descriptive Statistics)
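The binning step can also be sketched outside Excel. The following is a minimal Python sketch, assuming that each bin value acts as an inclusive upper limit and that anything above the last bin falls into a "More" bucket, which is how Excel's Histogram tool is understood to count. The function name `bin_counts` and the sample amounts are hypothetical; they are not taken from the Major Assignment Data file.

```python
from bisect import bisect_left

def bin_counts(values, bin_limits):
    """Count values per bin, treating each limit as an inclusive upper bound.

    Returns one count per limit plus a final count for values above the
    last limit (the "More" bucket). bin_limits must be sorted ascending.
    """
    counts = [0] * (len(bin_limits) + 1)
    for v in values:
        # bisect_left returns the index of the first limit that is >= v,
        # or len(bin_limits) when v exceeds every limit.
        counts[bisect_left(bin_limits, v)] += 1
    return counts

# Hypothetical amounts, for illustration only.
amounts = [50, 100, 150, 500, 650]
limits = [100, 200, 300, 400, 500]
for label, count in zip(limits + ["More"], bin_counts(amounts, limits)):
    print(label, count)
```

A roughly symmetric, single-peaked pattern in these counts is what the assignment asks you to look for before relying on the t-distribution.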

 

Program in Minitab:

MTB > Histogram 'Amount Covered';
SUBC> Bar.

Descriptive Statistics: Amount Covered

Variable         N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
Amount Covered  52   0  26722     1421  10250     9500  20125   25000  34000    75000

MTB > # c3 is assumed to hold the t critical values for the 90%, 95%, and 99% levels
MTB > let c6(1)=26722+c3(1)*(10250/(52^0.5))
MTB > let c6(2)=26722+c3(2)*(10250/(52^0.5))
MTB > let c6(3)=26722+c3(3)*(10250/(52^0.5))
MTB > let c7(1)=26722-c3(1)*(10250/(52^0.5))
MTB > let c7(2)=26722-c3(2)*(10250/(52^0.5))
MTB > let c7(3)=26722-c3(3)*(10250/(52^0.5))

Row  Confidence Interval  Upper bound  Lower bound
1    90.00%               29095.8      24348.2
2    95.00%               29574.8      23869.2
3    99.00%               30524.3      22919.7
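The interval formula used in the Minitab commands, mean ± t × StDev / √n, can be sketched in Python as follows. This is a minimal sketch: the function name `t_interval` is hypothetical, and the critical value 2.0076 (two-sided 95%, 51 degrees of freedom) is an assumed t-table value that does not appear in the source.

```python
from math import sqrt

def t_interval(mean, stdev, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * stdev / sqrt(n)."""
    margin = t_crit * stdev / sqrt(n)
    return mean - margin, mean + margin

# Summary statistics copied from the Minitab output above.
mean, stdev, n = 26722, 10250, 52

# Assumed two-sided 95% critical value for n - 1 = 51 degrees of freedom,
# read from a t table; it is not given in the source.
lower, upper = t_interval(mean, stdev, n, t_crit=2.0076)
print(round(lower, 2), round(upper, 2))
```

With this assumed critical value the bounds should land close to the 95.00% row of the table; swapping in the 90% or 99% critical value gives the other rows in the same way.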

 

 

 

 

 

 

 

 

You can use the format below to answer questions 2, 3, 4, and 5.

 

  • Step 1: Write the null and alternative hypotheses for the test

 

H0:

H1:

 

  • Step 2: Excel

(You have to submit the Excel output)

 

  • Step 3: Comparison and Conclusion

 

_____________________________________

 

Recall: Describing the p-Value (Excel)

  • If the p-value ≤ significance level α: reject the null hypothesis H0.
  • If the p-value > significance level α: do not reject the null hypothesis H0.

(accept, support)
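The decision rule above can be written as a one-line check. This is a minimal Python sketch; the function name `decide` is hypothetical, and the p-value 0.012 is used purely as an illustration.

```python
def decide(p_value, alpha):
    """Apply the p-value rule: reject H0 when p_value <= alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"

# The same p-value can lead to different decisions at different levels.
print(decide(0.012, 0.05))
print(decide(0.012, 0.01))
```

The significance level must be fixed before looking at the p-value, which is why each problem states it explicitly.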

 

___________________________________________________________________________________

 

 

 

 

  2. Analyzing the most recent data, Jennifer’s manager Dr. Steinberg found out that in 2019 the anti-cancer prescription drug claims submitted by Healthy Life clients contained amounts considerably greater than he had anticipated. Specifically, he asked Jennifer Nguyen to check whether patients with Oncotype DX scores over 30 claimed, on average, more than $1000 in 2019. Using the systematic sampling method, Jennifer selected 77 patients from this category (see the Major Assignment Data file). Please help Jennifer to test the claim that the population mean annual amount of the anti-cancer prescription drug claims was over $1000 in 2019. Use Data Analysis t-Test: Two-Sample Assuming Unequal Variances and the “fool” Excel approach. Use a 5% significance level.

Is it possible to reach the same conclusion at 1% significance level?

As you know, you have to make sure that the distribution of data is symmetric and bell-shaped in order to use t-distribution. Otherwise, you have to use nonparametric methods for data analysis. Please build a histogram for the data using bins $500, $1000, $1500, $2000, $2500.

(14 marks)

 

 

Objective: To determine whether patients with Oncotype DX scores over 30 claimed, on average, more than $1000 in 2019. Let μ denote the population mean annual claim. We need to test:

H0: μ ≤ $1000

Vs

H1: μ > $1000

The appropriate statistical test to test the above hypothesis would be a One sample t test, where we compare the mean of the population to a hypothesized value. But before running this test, we must ensure that the data is approximately normally distributed. We may check this assumption by constructing a Histogram as follows:

 

 

 

We find that the distribution of data is approximately symmetric and bell-shaped and hence, we may go for the t test.

Using Excel: Since Excel's Data Analysis tools do not include a one-sample t test, we may create a second comparison group with a mean claim of $1000 by entering every value as $1000. Because the variance of this new column is zero, we run an independent-samples t test with unequal variances as follows:
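Why the constant column works can be checked numerically. The following is a minimal Python sketch, assuming the usual unequal-variances (Welch) formulas; the function names and the eight claim amounts are hypothetical and are not taken from the Major Assignment Data file.

```python
from math import sqrt
from statistics import mean, variance

def one_sample_t(sample, mu0):
    """One-sample t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / sqrt(variance(sample) / n), n - 1

def welch_t(a, b):
    """Two-sample t statistic and degrees of freedom, unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical annual claims, for illustration only.
claims = [880.0, 1210.0, 1040.0, 950.0, 1320.0, 1105.0, 990.0, 1175.0]
# The "fool Excel" column: every entry equals the hypothesized mean.
dummy = [1000.0] * len(claims)

t_one, df_one = one_sample_t(claims, 1000.0)
t_two, df_two = welch_t(claims, dummy)
print(t_one, df_one)
print(t_two, df_two)
```

Because the constant column has zero variance, it contributes nothing to the standard error or to the degrees of freedom, so both computations agree: the two-sample output can be read as a one-sample test with n − 1 degrees of freedom.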

 

 

 

The independent-samples t test with unequal variances gave the test statistic t = 2.30, with p-value 0.012 < 0.05. Since the p-value is below the 5% significance level, we reject H0 at the 5% level. We conclude that the data provide sufficient evidence to support the statement that patients with Oncotype DX scores over 30 claimed, on average, more than $1000 in 2019.

If we tested the hypothesis at the 1% level instead of 5%, the result would not be significant (p-value = 0.012 > 0.01), so the data would not support the claim at the 1% level. The output agrees: the test statistic t = 2.30 < 2.378 does not lie in the rejection region, and hence we fail to reject H0 at the 1% level.

 

 

 

 

  3. From time to time, unfortunately, Healthy Life employees have to deal with insurance fraud. For example, some people claim medical services that were never provided or money they never paid. To that end, Healthy Life hired a number of investigators whose functions are not much different from those of police detectives.

Doctor N.N. has been under suspicion for some time for deceiving both Healthy Life and his patients. Healthy Life approached the provincial authorities, and they agreed to launch a formal investigation and open a case provided that credible evidence of fraud is supplied. The Healthy Life investigation department found a number of offences. These included “up-coding” or “upgrading,” which involved billing for more expensive treatments than those actually provided; providing and subsequently billing for treatments that were not medically necessary; scheduling extra visits for patients; referring patients to another physician when no further treatment was actually necessary; “phantom billing,” or billing for services not rendered; and “ganging,” or billing for services to family members or other individuals who were accompanying the patient but who had not personally received any services.

Jennifer Nguyen took part in this investigation together with Dr. Steinberg and Ian McGillivray, a former police detective and now a Healthy Life employee. At one point, Jennifer was asked to compare the amount Doctor N.N. charged for a certain medical procedure with the province average. Jennifer randomly selected a sample of forty cases (see the Major Assignment Data file). Can we support, at the 1% significance level, the doctor’s widely advertised claim that his average procedure fee is way below the population average of $510? Use Data Analysis t-Test: Two-Sample Assuming Unequal Variances and the “fool” Excel approach. Assume that the values are normally distributed.

(7 marks)

 

Based on the given data, a sample of 41 cases from the Major assignment data file:

Let μ denote the average fee charged by Dr. N.N. We need to compare this average to the population average, a hypothesized value of $510.

We need to test:

H0: μ ≥ $510

Vs

H1: μ < $510

However, the Data Analysis ToolPak in Excel does not provide an easy approach for a one-sample t test, so we need to resort to the available options, one of which, as mentioned in the problem, is the t-Test: Two-Sample Assuming Unequal Variances.

But this would require two samples for comparison, and at the same time we must keep the average of the second sample at $510. Suppose we create a hypothetical column of data with all values equal to $510 and run the test.

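Thus, we would rewrite the above hypotheses as follows, where μ1 is the mean fee charged by Dr. N.N. and μ2 = $510 is the mean of the hypothetical column:

H0: μ1 ≥ μ2

Vs

H1: μ1 < μ2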

Using Excel, running a t-test assuming unequal variances and normally distributed data:

 

We find that the p-value of the test, p = 0.382 > 0.01, is not significant at the 1% level. (Also, the test statistic t = -0.30 > -2.42 does not lie in the rejection region t < -t0.01,40.) We fail to reject the null hypothesis at the 1% level of significance. We conclude that the data do not provide sufficient evidence to support Dr. N.N.'s claim that his average procedure fee is way below the population average of $510.
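The critical-value comparison used above can be sketched as a small helper. This is a minimal Python sketch: the function name `in_rejection_region` and its `tail` labels are hypothetical, and the critical value is passed in as a positive table value.

```python
def in_rejection_region(t_stat, t_crit, tail):
    """Critical-value rule; t_crit is the positive table value."""
    if tail == "left":
        return t_stat < -t_crit
    if tail == "right":
        return t_stat > t_crit
    if tail == "two-sided":
        return abs(t_stat) > t_crit
    raise ValueError("tail must be 'left', 'right', or 'two-sided'")

# Values quoted in the answer above: t = -0.30, critical value 2.42, left-tailed.
print(in_rejection_region(-0.30, 2.42, "left"))  # prints False
```

A result of False corresponds to failing to reject H0, which matches the p-value conclusion.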

 

 

 

 

  4. It appears that not all people are equally vulnerable to cancer. People from some ethnic groups have higher or lower than average chances of developing certain types of cancer. Jennifer’s manager Dr. Steinberg thinks that the company’s insurance policy should reflect this fact and constantly raises this question in the meetings. To be more convincing, he asked Jennifer Nguyen to take a random sample of those Healthy Life clients who belong to ethnic group X (at least on their maternal or paternal side) and a similar sample consisting of the clients without X ancestry. Then, Jennifer was asked to compare the clients’ Y parameters indicating likelihood of developing W cancer. (As the matter is ticklish, we will not be naming the ethnic group, the type of cancer, the medical parameter, and the gender of sample members.) It is known that in the general population Y parameters are distributed with a standard deviation of 3.0 (therefore the population variance is equal to 9.0). However, in the X ethnic group the population standard deviation is about 2.1 (and the population variance is equal to 4.41). As only 2-3% of Canadians have X ancestors, Jennifer and her manager assume that the standard deviation in the population of those who have no X ancestry is also around 3.0, as in the country in general. This assumption allows using the z-distribution. The data are provided. Can we state at the 2% level of significance that it matters whether a person belongs to the X group or not in terms of the risk of developing W cancer? Or are the chances equal (on average, Y parameters are the same)? Help Jennifer Nguyen to conduct the test. Use Data Analysis z-Test: Two-Sample for Means.

(7 marks)
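The two-sample z statistic for this problem can be sketched in Python using only the standard library. The population variances 4.41 and 9.0 and the 2% level come from the problem statement; the function name `two_sample_z`, the sample means, and the sample sizes are hypothetical placeholders, since the actual values are in the data file.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """z statistic and two-tailed p-value for H0: mu1 = mu2, known variances."""
    z = (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

alpha = 0.02
# Two-tailed critical value: alpha / 2 in each tail of the standard normal.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical sample means and sizes, for illustration only.
z, p = two_sample_z(mean1=11.2, mean2=10.1, var1=4.41, var2=9.0, n1=40, n2=50)
print(round(z, 3), round(p, 4), round(z_crit, 3))
```

The test is two-tailed because the question is whether group membership matters in either direction, so H0 is rejected only when |z| exceeds the critical value (equivalently, when the two-tailed p-value is at most 0.02).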

 

     

 

 

 

 

 

 

  5. Smoking is a risk factor affecting a patient’s cardiovascular system (and not only the cardiovascular system, of course). Jennifer was asked to take random samples of male Healthy Life members recently admitted to hospitals in the GTA with heart attacks. She compiled two large separate files consisting of those male residents of Toronto who smoked and those who never smoked. The purpose of this research is to prove that the first heart attack occurs at an earlier age if a patient smoked. After carefully analyzing the data, Jennifer came up with the hypothesis that male smokers who suffer a first heart attack are, on average, six years younger than non-smoking males who suffer a first heart attack. Please calculate sample variances for both files and decide (depending on the sample variances) which function is most appropriate: Data Analysis t-Test: Two-Sample Assuming Equal Variances or Data Analysis t-Test: Two-Sample Assuming Unequal Variances. Please help Jennifer Nguyen to conduct the test at the 5% significance level. As it is known that both samples come from normally distributed populations, no histograms are required.

(11 marks)
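The two candidate test statistics can be compared side by side. The following is a minimal Python sketch, assuming the standard pooled and unequal-variances formulas with a hypothesized mean difference `d0` (the counterpart of the Hypothesized Mean Difference box in Excel's t-Test dialog). The function name `two_sample_t` and the ages are hypothetical; they are not taken from the data file, and the sketch does not settle how H0 and H1 should be worded.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b, d0=0.0, equal_var=True):
    """t statistic and degrees of freedom for H0: mean(a) - mean(b) = d0."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    diff = mean(a) - mean(b) - d0
    if equal_var:
        # Pooled variance weights each sample variance by its degrees of freedom.
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        return diff / sqrt(pooled * (1 / na + 1 / nb)), na + nb - 2
    # Unequal variances: separate standard errors and approximate df.
    sa, sb = va / na, vb / nb
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return diff / sqrt(sa + sb), df

# Hypothetical ages at first heart attack, for illustration only.
non_smokers = [68, 72, 65, 70, 74, 66, 71, 69]
smokers = [61, 66, 58, 64, 67, 60, 63, 65]

print(variance(non_smokers), variance(smokers))  # compare before choosing a test
t_pooled, df_pooled = two_sample_t(non_smokers, smokers, d0=6, equal_var=True)
t_welch, df_welch = two_sample_t(non_smokers, smokers, d0=6, equal_var=False)
print(t_pooled, df_pooled)
print(t_welch, df_welch)
```

When the two samples have the same size, both versions give the same t statistic and differ only in degrees of freedom; with unequal sizes and clearly different variances, the unequal-variances version is the safer choice.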

 

 

 

 

 

 

 

The test statistic is calculated using the formula mentioned. The critical value is obtained from StatKey (image attached for reference). We compare the test statistic with the critical value and make the required decision.

 

 

 

 

  6. Jennifer’s manager Dr. Jonathan Steinberg wonders whether Healthy Life members needed more chiropractic help in 2019 than in 2018, on average. Jennifer selected a random sample of those who were treated by chiropractic doctors in both years. The data are provided. Please help Jennifer Nguyen to check whether annual expenses and number of visits increased, on average. For both tests, use Data Analysis t-Test: Paired Two Sample for Means and a 2% significance level. Jennifer Nguyen and her manager know that in both cases the differences are normally distributed, so here again there is no need to build histograms.
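The paired test reduces to a one-sample t test on the year-over-year differences. This is a minimal Python sketch of that calculation; the function name `paired_t` and the visit counts are hypothetical and are not taken from the data file.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and degrees of freedom for H0: mean(after - before) <= 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical visit counts for the same four members in each year.
visits_2018 = [1, 2, 3, 4]
visits_2019 = [2, 4, 5, 8]
t_stat, df = paired_t(visits_2018, visits_2019)
print(round(t_stat, 3), df)
```

A large positive t statistic favours an increase from 2018 to 2019; the decision at the 2% level then follows from the one-tailed p-value or critical value for n − 1 degrees of freedom.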

 

 

 
