Tag Archive for statistics

Week 14

1. Upload notes

2. Redo the two in-class quiz questions in Excel: (1) submit the Excel file, (2) upload screenshots

P.106 3.60


P.138 4.12


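3. For the in-class unit Ch 16 ANOVA, make up your own data. You may use 2, 3, or 4 groups with 4-6 people per group, your choice. Invent your own hypothetical survey, such as last semester's beverage or newspaper survey; it may not be an example from the textbook. Using the formulas on textbook pages 645 and 646, compute the ANOVA summary table by hand.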
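A hand-computed ANOVA summary table can be checked with a short script. The sketch below is only an illustration: the three groups and their values are made up (a hypothetical "drinks per week" survey), and the function name `anova_summary` is not from the textbook. It uses the usual one-way decomposition into between-group and within-group sums of squares.

```python
from statistics import mean

def anova_summary(groups):
    """Return the one-way ANOVA summary quantities for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-groups: weighted squared distance of each group mean from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: squared distance of each value from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "ss_total": ss_between + ss_within, "df_total": n - 1,
        "F": ms_between / ms_within,
    }

# Hypothetical survey data (invented for illustration): drinks per week in 3 groups
cola = [3, 4, 5, 4]
tea = [6, 7, 6, 5]
juice = [2, 3, 2, 1]
table = anova_summary([cola, tea, juice])
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

The F statistic would then be compared with the F distribution with (df_between, df_within) degrees of freedom, for example through Excel's ANOVA: Single Factor tool.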

 


Week 13

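1. Upload notes

2. Search online for reports that involve regression analysis and the chi-square test. They can be academic papers, market research, survey reports, and so on. Upload screenshots of the important figures and tables, and explain in your own words what the screenshots mean.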

http://www.nature.com/articles/srep24136

The correlation between LDH serum levels and clinical outcome in advanced biliary tract cancer patients treated with first line chemotherapy

LDH may represent an indirect marker of neo-angiogenesis and worse prognosis in many tumor types. We assessed the correlation between LDH and clinical outcome for biliary tract cancer (BTC) patients treated with first-line chemotherapy. Overall, 114 advanced BTC patients treated with first-line gemcitabine and cisplatin were included. Patients were divided into two groups (low vs. high LDH), according to pre-treatment LDH values. Patients were also classified according to pre- and post-treatment variation in LDH serum levels (increased vs. decreased). Median progression free survival (PFS) was 5.0 and 2.6 months respectively in patients with low and high pre-treatment LDH levels (p = 0.0042, HR = 0.56, 95% CI: 0.37–0.87). Median overall survival (OS) was 7.7 and 5.6 months (low vs. high LDH) (p = 0.324, HR = 0.81, 95% CI: 0.54–1.24). DCR was 71% vs. 43% (low vs. high LDH) (p = 0.002). In 38 patients with decreased LDH values after treatment, PFS and OS were respectively 6.2 and 12.1 months, whereas in 76 patients with post-treatment increased LDH levels, PFS and OS were respectively 3.0 and 5.1 months (PFS: p = 0.0009; HR = 0.49; 95% IC: 0.33–0.74; OS: p < 0.0001; HR = 0.42; 95% IC: 0.27–0.63). Our data seem to suggest that LDH serum level may predict clinical outcome in BTC patients receiving first-line chemotherapy.

Introduction

Biliary tract cancer (BTC) is a rare group of tumors including gallbladder carcinomas and cholangiocarcinomas (extra hepatic cholangiocarcinoma, intrahepatic cholangiocarcinoma and Klatskin tumor). In Western Countries, BTC has an incidence of 1–2 cases/100,000.

Patients diagnosed with BTC usually have a dismal prognosis with a median overall as poor as 10–12 months for metastatic or locally

Results

Globally 114 patients with advanced BTC receiving a first line chemotherapy were available for our analysis. The cut-off point with the highest sensitivity and specificity for estimating pre-treatment LDH serum levels as a function of treatment clinical activity was set at 0.89 times the upper normal range (UNR) after ROC curve analysis (Fig. 1). Consequently patients showing a pre-treatment LDH serum level <0.89 UNR were classified as LDH-low patients (56 patients, 49%, group A) whereas patients with pre-treatment LDH serum level ≥0.89 UNR were classified as LDH-high patients (58 patients, 51%, group B).

Figure 1
LDH pre-treatment serum levels according to objective response to first line chemotherapy (responders vs. not responders): (a) Mann-Whitney test (p = 0.0155); (b) ROC curve analysis (p = 0.0112, cut off: ≥0.89).

3. (1) Compute the mean and variance of the following 20 sample observations

4 3 5 2
5 4 6 3
4 4 6 3
6 5 5 2
5 5 5 3


(2) Compute the mean and variance of the 5 observations under each brand

Brand A Brand B Brand C Brand D
4 3 5 2
5 4 6 3
4 4 6 3
6 5 5 2
5 5 5 3


(3) Compute the mean and variance of the following 4 values (the brand means from part (2))

4.8 4.2 5.4 2.6
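The three parts of this exercise can be checked with Python's standard `statistics` module. The sketch below uses the brand data listed above; the variable names are illustrative, and `variance` here is the sample variance (divisor n - 1), which is assumed to be the version the exercise intends.

```python
from statistics import mean, variance

# Data from the exercise, one list per brand (column)
brands = {
    "Brand A": [4, 5, 4, 6, 5],
    "Brand B": [3, 4, 4, 5, 5],
    "Brand C": [5, 6, 6, 5, 5],
    "Brand D": [2, 3, 3, 2, 3],
}

# (1) Mean and sample variance of all 20 observations
all_values = [x for values in brands.values() for x in values]
overall_mean = mean(all_values)
overall_var = variance(all_values)

# (2) Mean and sample variance within each brand
brand_means = {name: mean(values) for name, values in brands.items()}
brand_vars = {name: variance(values) for name, values in brands.items()}

# (3) Mean and sample variance of the 4 brand means
mean_of_means = mean(list(brand_means.values()))
var_of_means = variance(list(brand_means.values()))

print("overall:", overall_mean, overall_var)
print("per brand means:", brand_means)
print("per brand variances:", brand_vars)
print("brand means:", mean_of_means, var_of_means)
```

The brand means come out as 4.8, 4.2, 5.4, and 2.6, which are the four values given in part (3). These quantities are the building blocks of the between-group and within-group variation used in ANOVA.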


 


Week 12

1. Upload notes

2. The data are as follows

Following the "In summary" box on textbook p. 585, check points 1-4 one by one; see textbook Section 14.5

  1. Linear relation? Yes.
  2. No outliers? There is one outlier.
  3. Constant variance? The residuals range from 20 to -20.
  4. Normal distribution? The residuals do not follow the standard normal distribution N(0, 1); they have mean 0 and variance 179.

3. Referring to textbook Chapter 15 or Chapter 4, complete Exercise 15.17 on p. 625.

a.

b.

H0: Age is unrelated to the frequency of reading newspapers in the population.

H1: Age is related to the frequency of reading newspapers in the population.

c.

Chi-square value: 98.48766483

p-value = 1.35463579359821E-18 ≈ 0

Level of significance: 0.05

Conclusion: The p-value is below 0.05, so reject the null hypothesis; i.e., age is related to the frequency of reading newspapers in the population.
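The chi-square statistic in part c comes from comparing observed counts with the counts expected under independence. The sketch below shows that calculation on a made-up 2 × 3 table (the exercise's own table is not reproduced in this post, so these counts are purely hypothetical). A 2 × 3 table has 2 degrees of freedom, and only for that case does the upper-tail chi-square probability have the simple closed form exp(-x/2) used here; for other table sizes a chi-square table or software is needed.

```python
import math

def chi_square_statistic(table):
    """Chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = age group, columns = reading frequency
observed = [
    [30, 40, 30],   # younger: daily, sometimes, never
    [50, 35, 15],   # older:   daily, sometimes, never
]
chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Exact upper-tail probability, valid only because df == 2 here
p_value = math.exp(-chi2 / 2)

alpha = 0.05
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"chi-square = {chi2:.4f}, df = {df}, p-value = {p_value:.4f}, {decision}")
```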

 

Week 11

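1. Upload notes

2. Choose your own 11 pairs of X and Y data, as in Example 14.53 on p. 956, but the numbers may not be identical to it.
Using the method demonstrated in class together with the regression analysis in Excel or PHStat, see whether you can reproduce the same figures as the regression output, and mark them with yellow fill. See textbook pp. 569, 571, 573, 574, 81, 88. An example of the part you must complete yourself is shown below:

(1) Post screenshots of your own analysis and of the software's regression output on the platform

(2) Also upload the Excel file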

3. Following the video tutorial, add a few tags to this assignment
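For item 2 above, the figures in a regression output can be reproduced from the basic sums of squares. The following is a minimal sketch, not the in-class method itself: the 11 data pairs are invented, and `fit_line` is a hypothetical helper name. It computes the least-squares slope and intercept, SSE, r-squared, and the standard error of the estimate.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x with basic summary figures."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)          # total sum of squares (SSTO)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)             # sum of squared errors
    return {
        "slope": slope,
        "intercept": intercept,
        "sse": sse,
        "r_squared": 1 - sse / syy,
        "std_error": (sse / (n - 2)) ** 0.5,         # standard error of the estimate
        "residuals": residuals,
    }

# Hypothetical data: 11 (x, y) pairs invented for illustration
x_data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
y_data = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9, 22.2]
fit = fit_line(x_data, y_data)
print(f"y-hat = {fit['intercept']:.4f} + {fit['slope']:.4f} x")
print(f"SSE = {fit['sse']:.4f}, r^2 = {fit['r_squared']:.4f}, s = {fit['std_error']:.4f}")
```

These values can be compared cell by cell with the coefficients, R Square, and Standard Error entries in the Excel or PHStat regression output.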


Week 10

I did not do well on this exam, mostly because I was too nervous and not very familiar with the computer operations, and I also forgot to bring my textbook, so the result was not ideal. Although I passed, there is still a lot of room for improvement.

Ways to improve:

1. Get familiar with the concepts

2. Practice a lot

3. Teach others

4. Look up material online (outsourcing)

 

 


Week 9

1. 13.30 (complete homework version: https://www.dropbox.com/s/1n3yxo3dznfrmvv/03154150week9.pdf?dl=0)

The photos cannot be uploaded and I don't know why; please bear with me.

(1)

 

H0: the mean height of females who prefer to sit in the back of the room ≤ average

H1: the mean height of females who prefer to sit in the back of the room > average

(4) p-value < α = 0.05, so reject the null hypothesis; i.e., the mean height of females who prefer to sit in the back of the room is greater than average.

 

2. 13.44

(1)

H0: mean of placebo − mean of drug = 0

H1: mean of placebo − mean of drug ≠ 0

(4) p-value > α = 0.05, so do not reject the null hypothesis; the claim that the drug reduces jet lag is not supported.

3. 13.45

(1)

H0: blood pressure before − after = 0

H1: blood pressure before − after ≠ 0

(4) p-value < α, so reject the null hypothesis; i.e., blood pressure is higher before seeing the dentist.

4. 13.60

(1)

H0: men's mean time of exercising = women's mean time of exercising

H1: men's mean time of exercising ≠ women's mean time of exercising

(4) p-value > 0.05, so do not reject the null hypothesis; i.e., the time spent exercising has no association with gender.
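Exercise 13.45 above is a paired (before/after) design, where the test is run on the differences. The sketch below illustrates that test statistic with made-up blood pressure readings for 10 hypothetical patients; none of these numbers come from the textbook. The critical value 2.262 is the two-sided 5% t value for 9 degrees of freedom, hard-coded because the standard library has no t distribution.

```python
from statistics import mean, stdev

# Hypothetical paired readings (invented for illustration), same 10 patients
before = [132, 128, 140, 135, 150, 142, 138, 129, 145, 136]
after = [128, 126, 135, 134, 144, 140, 133, 130, 139, 132]

# Work with the within-patient differences
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = mean(diffs)                 # sample mean of the differences
s_d = stdev(diffs)                  # sample standard deviation of the differences
se = s_d / n ** 0.5                 # standard error of the mean difference
t_stat = (d_bar - 0) / se           # null value is 0

T_CRITICAL = 2.262                  # two-sided, alpha = 0.05, df = 9 (t table)
decision = "reject H0" if abs(t_stat) > T_CRITICAL else "do not reject H0"
print(f"mean diff = {d_bar:.2f}, s.e. = {se:.4f}, t = {t_stat:.3f}, df = {n - 1}: {decision}")
```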


What are the p-value and the null value?

P-value

The p-value is computed by assuming that the null hypothesis is true. When the p-value is small enough, we reject the null hypothesis in favor of the alternative hypothesis. "Small enough" is defined as p-value ≤ α, where α is the level of significance (usually 0.05), which equals 1 − confidence level.

Null value

H0: population parameter = null value

The null value is the specific number such that, if the parameter equals that number, the null hypothesis is true.

Two-sided alternative hypothesis:

Ha: population parameter ≠ null value

One-sided alternative hypothesis (choose one)

Ha: population parameter > null value

Ha: population parameter < null value

The alternative hypothesis never includes the equals sign.

Example 1 one-sided hypothesis test:

If researchers wanted to find out whether men have a lower mean pulse than women, the hypotheses for this one-sided hypothesis test would be:

H0: μ1 − μ2 = 0 (μ1 = μ2)

Ha: μ1 − μ2 < 0 (μ1 < μ2)

μ1 and μ2 are the mean pulse rates for the populations of all men and all women, respectively, and the null value is 0.

Example 2

Suppose that a null hypothesis, in words, is that the mean weight for the population of newborn babies is the same in the United States as it is in England.

H0: μ1 − μ2 = 0

Null value = 0

Example 3

A legislator wondered whether more than 50% of the voters in her district favored a law that would reduce the legal blood alcohol level that defines drunk driving. We let p = proportion of all voters in the district favoring the lower limit. A majority is p > 0.5, so the null and alternative hypotheses for this situation may be written as:

H0: p ≤ 0.5 (not a majority)

Ha: p > 0.5 (a majority)

The null value in this instance is p0 = 0.5
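To show how the null value enters the calculation in Example 3, here is a small sketch of a one-sided z-test for a proportion. The survey result (220 of 400 voters in favor) is invented for illustration; only the null value p0 = 0.5 and the direction of the alternative come from the example. The p-value uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

p0 = 0.5            # null value from Example 3
n = 400             # hypothetical sample size (assumption)
in_favor = 220      # hypothetical number of voters in favor (assumption)

p_hat = in_favor / n
null_se = (p0 * (1 - p0) / n) ** 0.5      # standard error computed under H0
z = (p_hat - p0) / null_se                # (sample statistic - null value) / null s.e.

# One-sided alternative Ha: p > 0.5, so the p-value is the upper-tail area
p_value = 1 - NormalDist().cdf(z)

alpha = 0.05
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"p-hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}: {decision}")
```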


Week 7

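1. Upload class notes

2. Preview
(1) Translate in full the last Definition on p. 466

The level of significance

Denoted by the Greek letter α, it is the borderline value used to judge whether a p-value is small enough to choose the alternative hypothesis (it determines the rejection region). When the p-value is less than or equal to α, reject the null hypothesis. When the p-value is greater than α, the null hypothesis cannot be rejected. The level of significance is also called the α-level of the test. It is chosen by the researcher.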

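(2) Translate in full all of Example 12.7 on p. 469

Errors in medical testing

Imagine that you are being tested for a disease. When the lab technician and the physician evaluate your results, they must choose between two hypotheses:

Null hypothesis: You do not have the disease. Alternative hypothesis: You have the disease.

Unfortunately, many laboratory tests for diseases are not 100% accurate, and the result may be wrong. Consider the two possible errors and their consequences:

Possible error 1: You are told you have the disease, but you actually do not. The test result is a false positive.

Consequence: You will worry needlessly about your health, and you may receive unnecessary treatment and suffer adverse side effects.

Possible error 2: You have the disease, but you are told you do not. The test result is a false negative.

Consequence: You have the disease but receive no treatment, and if the disease is contagious, you may infect others.

Which error is more serious? In most medical situations the second one, a false negative, is more serious, but it depends on the disease and on the follow-up actions that are taken. For example, in a cancer screening test, a false negative could lead to a fatal delay in treatment. When an initial cancer test comes back positive, it is usually followed by a retest, so a false positive would be discovered quickly.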

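(3) Translate in full the Definition on p. 470

Type 1 error: can occur only when the null hypothesis is true. The error is deciding in favor of the alternative hypothesis.

Type 2 error: can occur only when the alternative hypothesis is true. The error is failing to reject the null hypothesis.
(4) Translate in full the Definition on p. 471

When the null hypothesis is true, the probability of a Type 1 error equals the level of significance (the α level). When the null hypothesis is not true, a Type 1 error cannot be made, so its probability is 0.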
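The statement that the Type 1 error probability equals α when the null hypothesis is true can be illustrated by simulation. The sketch below is an assumption-laden toy setup, not from the textbook: it repeatedly draws samples from a normal population where H0: μ = 0 really is true (σ = 1 known), runs a two-sided z-test each time, and counts how often H0 is wrongly rejected. The function name, sample size, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist

def type1_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Fraction of simulated samples that reject a true H0: mu = 0 (sigma = 1 known)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]   # H0 is true by construction
        sample_mean = sum(sample) / n
        z = sample_mean / (1 / n ** 0.5)               # (mean - 0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                            # a Type 1 error
    return rejections / trials

rate = type1_error_rate()
print(f"observed Type 1 error rate: {rate:.3f} (alpha = 0.05)")
```

With a few thousand trials, the observed rate should land near 0.05, up to simulation noise.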
3. Review
(1)P.500, 12.6

a. H1:p=0.7

b.H1:p>0.45

c.H1:p<0.4

(2)P.501, 12.20

a.0.03

b.0.05

c.0.61

d.100

e.0.5

(3)P.508,12.104 based on Example 12.17 on Page 488

Step1: Determine the null and alternative hypothesis.

H0: p1 − p2 ≤ 0 (or p1 ≤ p2)

Ha: p1 − p2 > 0 (or p1 > p2)

Step2: Summarize the data into an appropriate test statistic after first verifying necessary data conditions are met.

  • p^1 = 0.25, p^2 = 0.09
  • The sample statistic is p^1 − p^2 = 0.25 − 0.09 = 0.16
  • The combined proportion is p^ = (783.81 + 250)/(8709 + 1000) = 0.106
  • The null standard error is null s.e.(p^1 − p^2) = [0.106(1 − 0.106)(1/8709 + 1/1000)]^(1/2) = 0.0102783, about 0.0103
  • z = (Sample statistic − Null value)/Null standard error = 0.16/0.0103 = 15.53

Step 3, 4, and 5:

The z-score is about 15.53. This is far beyond the values listed in a standard normal table such as Table A.1, so the cumulative probability is essentially 1 and the p-value is essentially 0 (smaller than 0.0000001).

Taking α = 0.05, which is larger than the p-value, we reject the null hypothesis.
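The arithmetic in Step 2 can be rechecked with a few lines of Python. The sketch below plugs in the sample proportions and sample sizes that appear in the calculation above (0.25 with n = 1000, and 0.09 with n = 8709, as implied by the combined-proportion line); the variable names are illustrative. It keeps full precision rather than rounding the combined proportion to 0.106, so the last digits differ slightly from a hand calculation.

```python
from statistics import NormalDist

# Values taken from the worked solution above
p_hat1, n1 = 0.25, 1000
p_hat2, n2 = 0.09, 8709

sample_statistic = p_hat1 - p_hat2
# Combined (pooled) proportion: total successes over total sample size
combined = (p_hat1 * n1 + p_hat2 * n2) / (n1 + n2)
null_se = (combined * (1 - combined) * (1 / n1 + 1 / n2)) ** 0.5
z = (sample_statistic - 0) / null_se          # null value is 0

# One-sided alternative Ha: p1 - p2 > 0, so use the upper tail
p_value = 1 - NormalDist().cdf(z)

print(f"combined proportion = {combined:.4f}")
print(f"null s.e. = {null_se:.6f}")
print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```

A z-score this large puts the p-value below the precision of any printed normal table, so the conclusion to reject the null hypothesis at α = 0.05 does not depend on rounding.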



Week 5

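1. Upload class notes

2. Preview
(1) Translate in full p. 462, from 12.1 up to Lesson 1

An Overview of Hypothesis Testing

Any hypothesis test (also called a significance test) has five basic steps. Details on applying these steps to the five proportion parameters are covered in Chapter 13. Applications of hypothesis testing in other situations are covered in Chapters 14 to 16. The same five steps are always used, although some of the details change. The five steps, introduced in Chapter 4, are as follows:

1. Determine the null and alternative hypotheses used for inference about the population.

2. After verifying that the necessary data conditions are met, summarize the data into an appropriate test statistic.

3. Find the p-value by comparing the test statistic with the possibilities that would be expected if the null hypothesis were true.

4. Use the p-value to decide whether the result is statistically significant.

5. Put the statistical conclusion into words.

Lessons 1 and 2 of this module describe the basic concepts and definitions behind the five steps. Lesson 3 discusses the possible errors in hypothesis testing and the factors that affect their likelihood.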
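(2) Translate in full the definition on p. 463

The null hypothesis, denoted H0, says that nothing is happening. The specific null hypothesis varies from problem to problem, but it can generally be thought of as the status quo, or no relationship, or no difference. In most situations, the researcher hopes to disprove or reject the null hypothesis.

The alternative hypothesis, denoted H1, says that something is happening. In most situations, this is the hypothesis the researcher hopes to prove. It may be a statement that the status quo is false, or that there is a relationship, or that there is a difference.
(3) Translate in full the definition on p. 464

A one-sided hypothesis test is one in which the alternative hypothesis specifies parameter values in a single direction from the specified "null" value. A one-sided hypothesis test is also called a one-tailed hypothesis test.

A two-sided hypothesis test is one in which the alternative hypothesis specifies parameter values in both directions from the specified "null" value. A two-sided hypothesis test is also called a two-tailed hypothesis test.
(4) Translate in full the definition on p. 466

The test statistic for a hypothesis test is a summary of the data, used to evaluate the null and alternative hypotheses.

The p-value is computed by assuming the null hypothesis is true and then determining the probability of a test statistic as extreme as, or more extreme (in the direction of the alternative hypothesis) than, the observed test statistic.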
3. Review
Redo in-class Example 11.60 using PHStat

4. Quiz next week, covering Ch10 and Ch11 (the parts I taught)


Week 4

1. Remember to post the 3/16 notes online!

2. P.451 Q11.26, complete calculation

3. P.451 Q11.30, complete calculation

4. Do Q11.30 with the PHStat4 software

5. Translate in full P.439 Lesson 2 up to the Formula on p. 440

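The Equal Variance Assumption and the Pooled Standard Error

When estimating the difference between two population means, it is sometimes reasonable to assume that the two populations have the same standard deviation. The variance is the square of the standard deviation, so assuming equal standard deviations also means the variances are equal. In statistical notation, the assumption of equal population variances is written σ1^2 = σ2^2 = σ^2, where σ^2 denotes the common value of the variance. Under the equal variance assumption, the data from the two groups can be combined to estimate σ^2. The variance estimated from the combined data is called the pooled variance. The square root of the pooled variance is called the pooled standard deviation, computed as follows:

sp = [((n1 − 1)s1^2 + (n2 − 1)s2^2)/(n1 + n2 − 2)]^(1/2)

Replacing the individual standard deviations s1 and s2 with the pooled version sp in the formula gives the pooled standard error for the difference between two means:

pooled s.e. = sp × (1/n1 + 1/n2)^(1/2)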

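This may look complicated, but if the equal variance assumption is correct, it gives a simpler mathematical solution for finding the multiplier t. In this case, the degrees of freedom are df = n1 + n2 − 2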
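The pooled quantities described in this passage can be written out as a short calculation. The sketch below uses the standard pooled standard deviation formula, sp = sqrt(((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)), and compares the pooled standard error with the unpooled one. The sample sizes and standard deviations are made up for illustration, and the function names are hypothetical.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation under the equal variance assumption."""
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)

def pooled_se(n1, s1, n2, s2):
    """Pooled standard error of the difference between two sample means."""
    return pooled_sd(n1, s1, n2, s2) * math.sqrt(1 / n1 + 1 / n2)

def unpooled_se(n1, s1, n2, s2):
    """Unpooled standard error of the difference between two sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Hypothetical summary statistics (invented for illustration)
n1, s1 = 10, 3.0
n2, s2 = 15, 4.0
print(f"pooled sd   = {pooled_sd(n1, s1, n2, s2):.4f}")
print(f"pooled s.e. = {pooled_se(n1, s1, n2, s2):.4f}, df = {n1 + n2 - 2}")
print(f"unpooled s.e. = {unpooled_se(n1, s1, n2, s2):.4f}")
```

When the two sample sizes are equal, the pooled and unpooled standard errors coincide; they differ only when the sample sizes differ.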

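6. Translate in full P.442 Pooled or Unpooled?

Pooled or Unpooled?

In Example 11.14, the sample standard deviations for men and women were about the same, so it is reasonable to assume that the population standard deviations are equal. However, the confidence interval for the difference in means would be about the same even without the equal standard deviation assumption. With the unpooled procedure, the 95% confidence interval for the difference in population means is -0.10 to 1.03 hours, which is quite close to the pooled interval of -0.103 to 1.025 hours. One advantage of the pooled method is that it is simpler.

The sample standard deviations of two independent samples will almost never be identical. So how do we know when it is reasonable to use the pooled confidence interval for the difference in population means? And what is the risk of using the pooled procedure when the population standard deviations really are different? We will examine this question closely when we cover hypothesis testing in Chapter 13, but here we give only some initial guidance:

*If the larger of the two sample standard deviations comes from the group with the larger sample size, the pooled version tends to produce a wider confidence interval than the unpooled version, so it is a more conservative estimate of the difference, as the next example illustrates. As with using the conservative margin of error in a confidence interval for a proportion, using the more conservative pooled method is acceptable, but producing an interval that is too wide is not good practice.

*On the other hand, if the smaller of the two sample standard deviations comes from the larger sample, the pooled method may produce a misleadingly narrow interval.

*In general, it is best to use the unpooled procedure unless the sample standard deviations are very similar.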

 

7. Explain in detail how the yellow-highlighted numbers in the table below were obtained

Sample standard deviation: s = [Σ(x − x̄)^2/(n − 1)]^(1/2) = 1.5206906

Standard error of the mean = s/n^(1/2) = 1.5206906/9^(1/2) = 0.506896878

Interval lower/upper limit = sample mean ± t × s.e. = 25.5 ± 2.3036 × 0.506896878
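The last step, turning the standard error and the t multiplier into interval limits, can be verified numerically. The sketch below uses the figures quoted in the notes above (sample mean 25.5, s = 1.5206906, n = 9, multiplier 2.3036 as written); the variable names are illustrative, and the multiplier is taken as given rather than looked up.

```python
import math

# Figures quoted in the notes above
n = 9
sample_mean = 25.5
s = 1.5206906          # sample standard deviation
t_multiplier = 2.3036  # multiplier as written in the notes (df = n - 1 = 8)

standard_error = s / math.sqrt(n)
margin_of_error = t_multiplier * standard_error
lower_limit = sample_mean - margin_of_error
upper_limit = sample_mean + margin_of_error

print(f"standard error  = {standard_error:.9f}")
print(f"margin of error = {margin_of_error:.6f}")
print(f"interval: {lower_limit:.4f} to {upper_limit:.4f}")
```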
