Methodology Booklet, edited May 2012

Last Edition: 5/12/2012 12:26:49 AM
GUIDEBOOK
SCIENTIFIC RESEARCH, STATISTICS AND AN INTRODUCTION TO SPSS
7th Printing
By: Dr Suhazeli bin Abdullah, Family Medicine Specialist, Klinik Kesihatan Marang, Terengganu.
In conjunction with
Introduction to SPSS, organised by the Terengganu State Health Department (Jabatan Kesihatan Negeri Terengganu), 15 and 16 May 2012
All rights reserved. 2012, Suhazeli MD.
All material in this book and CD is NOT copyrighted by the author; it may be reprinted without the author's permission. All the reference materials can be downloaded from the website suhazeli-files.blogspot.com. You may request to reprint or reproduce material from this book or CD by emailing me at [email protected], or by post at Klinik Kesihatan Marang, 21600, Marang, Terengganu.
Printing 1: March 2005
Printing 2: March 2006
Printing 3: May 2006
Printing 4: March 2008
Printing 5: April 2009
Printing 6: April 2010
Printing 7: March 2012
Printing 8: May 2012
TABLE OF CONTENTS

A Word from the Author
Foreword to the Seventh Printing
Acknowledgements
Contents of the Compact Disc (DVD)

CHAPTER II. Basic Statistics
  Definitions of Statistical Terms
  What Is Epidemiology?
  Some Statistical Measures
  Confounding
  Measurement error and bias
  Analysing validity
  Types of Statistical Study and Study Design

CHAPTER III. Conducting Scientific Research
  Research Objective
  Study Hypothesis
  How to search for materials
  How to do a study
  How to choose a study topic
  Data Collection
  Sampling Method
  Inferential Statistics

CHAPTER IV. Basic Introduction to SPSS
  Introduction to computerised analysis systems
  How to begin
  Saving data for analysis
  Creating variables in SPSS
  Adding labels to variables
  Transform Data [Compute & Recode]
  Exploring
  Frequency
  Descriptives
  Importing Files, Copy & Paste
  Select and Deselect Cases

CHAPTER V. Parametric Analysis
  Z-Test
  Chi-Square Test [χ²]
  Independent T-Test
  Paired T-Test
  ANOVA (Analysis of Variance)
  Correlation
  Regression

References
Index (words to look up)
Appendix: Matching Statistical Tests to Variable Types
Steps for Analysing Data
List of Exercises

Exercise 1: Enter variable names and types in the SPSS spreadsheet
Exercise 2: Complete the variables and labels in the SPSS spreadsheet
Exercise 3: Calculate the following using the formulas you know
Exercise 4: Recode the variables below
Exercise 5: Explore the following variables
Exercise 6: Obtain the frequencies of the following variables
Exercise 7: Run the DESCRIPTIVES command yourself on the numerical variables below
Exercise 8: Import an Excel-format file (on the CD; file name: Drug T Student) into SPSS
Exercise 9: Try COPY & PASTE yourself using other figures and tables
Exercise 10: Interpret the following table
Exercise 11: Analyse the file tbkkp.sav for tuberculosis patients using what you have learned
Exercise 12: Find the p value for a study of obesity in Setiu district
Exercise 13: Find the correlation between the mother's weight during pregnancy and the baby's weight at birth
Exercise 14: Run a logistic regression on maternal smoking and SGA babies
A Word from the Author

Scientific research is research that allows a person to prove the hypothesis they hold. It will not succeed unless specific methods are used, and those methods cannot be acquired without studying diligently, understanding every step and practising repeatedly. My own experience of conducting scientific research during my master's programme was painful. While completing my thesis I consulted many experts in statistics. Their advice was inconsistent and left me confused. The one-week statistics course I attended during the master's programme did not give me in-depth knowledge. That changed when I attended the basic SPSS course run by Dr Azmi and Dr Mohd Rizal of the Medical Research Unit, Faculty of Medicine, UKM. The course was brief, but its content was dense and so meaningful to me that I was able to complete my thesis properly.

The second thing a researcher needs is to be comfortable with the computer. If we are afraid to hold a mouse, the analysis we carry out may be nothing more than a product of our own thinking. A computer program can produce accurate output that can be worked into results that convince the readers of our research. Writing and drafting also need the support of a computer so that our text looks attractive, tidy and polished. Researchers should learn a few techniques for laying out (formatting) their results.

In this brief opportunity I will do my best to show the statistical methods and the use of computer software needed to reach the research results we want. This is the path I have travelled, and it is already well lit; I will only shine a light on it so that it is easier to see. May none of you grope about in bright light.

Remember the statistics saying: if you are lost in learning statistics, you are on the right path. Happy studying!

DR SUHAZELI BIN ABDULLAH
Family Medicine Specialist, Klinik Kesihatan Marang
[email protected]
Tel: 09-6182030 Fax: 09-6184485
Saturday, May 12, 2012
Foreword to the Eighth Printing

I am grateful to be able to publish the 7th printing of this book with updated content. The topics discussed, however, do not differ much from earlier printings. Most of the data used in the exercises come from Setiu district itself. They are the result of several QAP studies we carried out during 2005. I hope everyone who reads this guide will understand it and find it helpful in carrying out studies, whether QAP, HSR, KMK or other relevant scientific research.

Dr Suhazeli Abdullah, FMS
Klinik Kesihatan Marang, Bandar Marang, Terengganu
Visit the website suhazeli-files.blogspot.com
Acknowledgements
This guidebook was prepared from several sources, which may or may not be fully reliable. The main sources are the handbooks from the SPSS workshops (basic and advanced sessions) organised by the Medical Research Unit, Faculty of Medicine, UKM, and the lecture notes of Dr Azmi Mohd Tamil and Dr Mohd Rizal Abdul Manaf. I also drew on several workshops I attended, among them Minggu Penyelidikan UKM (UKM Research Week), the Evidence Based Medicine Course (UMMC), and the Workshop on Research Network Development for WONCA Asia Pacific Region. I welcome any corrections or comments on errors found in this guidebook. I can be contacted by email or telephone. Exercise notes are included with this guidebook on compact disc. I hope you all benefit from them.
Contents of the Compact Disc (DVD)

1. SPSS Version 17.0
2. Data files for the studies: sga.dbf, sga.sav, sga-korelasi.sav, sga-logistik.sav, sgapair.sav and several others
3. Some other reference materials that you can read
4. This guidebook in PDF form (metodologybooklet.pdf)
CHAPTER II.
Basic Statistics
Definitions of Statistical Terms

1. Data
Figures obtained by measuring, observing or counting real objects.
2. Population
The complete set of measurements of interest in a study.
3. Sample
A subset of the population.
4. Random sample
A sample in which every individual in the population has an equal chance of being selected.
5. Raw data
Observations that have not yet been processed.
6. Sample size
The number of individuals selected for the sample.
7. Variable
A characteristic measured in the study.
8. Qualitative
Data recorded as categories or groups.
9. Quantitative
Data recorded as numbers.
10. Dichotomous
A variable with two possible values, such as Yes/No.
11. Polytomous
A variable with more than two possible values.
12. Bias
Study error that can be caused by the investigator, by mistakes during data collection, or by flaws in the study design.
13. Confounders
Study error arising from characteristics of the sample. These characteristics usually cannot be changed, for example age, sex and ethnicity.
14. Null hypothesis
The hypothesis that there is no effect or difference. The study tests whether the data provide significant evidence against it.
15. P value
The probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. It is used to judge whether a finding is statistically significant.
16. Mean
The average of the numbers in a set of study data. It is usually used for normally distributed data.
17. Median
"Middle value" of a list. The smallest number such that at least half the numbers in the list are no greater than it. If the list has an odd number of entries, the median is the middle entry in the list after sorting the list into increasing order. If the list has an even number of entries, the median is equal to the sum of the two middle (after sorting) numbers divided by two. The median can be estimated from a histogram by finding the smallest number such that the area under the histogram to the left of that number is 50%.
18. Mode
For lists, the mode is the most common (frequent) value. A list can have more than one mode. For histograms, a mode is a relative maximum ("bump").
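19. Standard deviation and standard error of the mean
$\bar{X} \pm Z_{\alpha/2}\,\mathrm{SE}$: the mean with its standard error is used to infer from the sample to the population.
$M \pm \mathrm{SD}$: the mean with its standard deviation describes the data themselves...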
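The summary measures defined above can be sketched in a few lines of Python. This is only an illustration: the sample values are hypothetical, and the 1.96 multiplier assumes a normal approximation with alpha = 0.05.

```python
import math
import statistics

# Hypothetical sample (for example, birth weights in kg); not data from this guide.
values = [2.9, 3.1, 3.1, 3.4, 3.6, 3.8, 4.0]

n = len(values)
mean = statistics.mean(values)
median = statistics.median(values)      # middle value after sorting
modes = statistics.multimode(values)    # a list can have more than one mode
sd = statistics.stdev(values)           # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)                  # standard error of the mean

# Assumption: Z_{alpha/2} = 1.96 for a 95% interval under a normal approximation.
z = 1.96
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean={mean:.2f} median={median} modes={modes}")
print(f"SD={sd:.2f} (describes the data), SE={se:.2f} (describes the estimate)")
print(f"95% CI for the population mean: {ci_low:.2f} to {ci_high:.2f}")
```

The standard deviation describes the spread of the observations, while the standard error shrinks as the sample grows and describes how precisely the mean is estimated.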
What Is Epidemiology?
Epidemiology is the study of how often diseases occur and why. Epidemiological information is used to plan and evaluate strategies for preventing disease and as a guide to treating it. Epidemiological studies are carried out on populations at risk. Studying every member of a population is very difficult, so observations are usually made on a study sample, drawn from the larger population by specific methods.
[Figure: nested diagram showing the target population, which contains the study population, which contains the study sample]
Some Statistical Measures

1. Measuring disease frequency
Some measures commonly used in statistical studies:
a) Incidence
The incidence of a disease is the rate at which new cases occur in a population during a specified period. For example, the incidence of thyrotoxicosis in 1998 was 10/100,000/year in Singapore compared with 49/100,000/year in Malaysia.
Roughly:
$$\text{Incidence} = \frac{\text{number of new cases}}{\text{population at risk} \times \text{time}}$$
or
$$\text{Incidence} = \frac{\text{number of new cases}}{\text{total person-years at risk}}$$
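As a rough sketch of the incidence formula, the function below divides new cases by person-years at risk and scales the result to a chosen population size. The function name and the example counts are hypothetical and chosen only to illustrate the arithmetic.

```python
def incidence_rate(new_cases, person_years, per=100_000):
    """New cases per `per` person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * per / person_years

# Hypothetical example: 98 new cases among 40,000 people followed for 5 years.
person_years = 40_000 * 5
rate = incidence_rate(new_cases=98, person_years=person_years)
print(f"{rate:.0f} per 100,000 per year")
```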
b) Prevalence
The prevalence of a disease is the proportion of a population that are cases at a given point in time. Example: the prevalence of wheezing among primary school children in Malaysia, studied in 1986, was estimated at 3%. The symptom was identified from parents' answers to a questionnaire that was distributed to them. Prevalence is an appropriate measure only for relatively stable conditions; it is unsuitable for acute diseases.
Prevalence = incidence × average duration of the disease.
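The relation between prevalence, incidence and duration can be illustrated with a short calculation. It is a sketch under the usual steady-state assumption (stable incidence and duration, and a disease that is not very common); the numbers are hypothetical.

```python
def approximate_prevalence(incidence_per_person_year, mean_duration_years):
    """Steady-state approximation: prevalence = incidence x average duration."""
    return incidence_per_person_year * mean_duration_years

# Hypothetical chronic disease: 2 new cases per 1,000 person-years, lasting 10 years on average.
chronic = approximate_prevalence(0.002, 10)

# Hypothetical acute disease with the same incidence but lasting about one week.
acute = approximate_prevalence(0.002, 7 / 365)

print(f"chronic: {chronic:.2%} of the population are cases at any one time")
print(f"acute:   {acute:.4%} of the population are cases at any one time")
```

With the same incidence, the short-lived disease has a far smaller prevalence, which is why prevalence says little about acute conditions.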
c) Mortality
Mortality is the incidence of death from a disease.

d) Relationship between incidence and prevalence
Each new (incident) case enters the prevalence pool and remains there until recovery or death. If recovery and death rates are low, chronicity is high, and even a low incidence will produce a high prevalence. Incidence is the best measure of disease frequency. Prevalence is often used as an alternative to incidence in studies of chronic diseases such as multiple sclerosis.
e) Commonly used incidence rates

Birth rate = Number of live births / Mid-year population
Fertility rate = Number of live births / Number of women aged 15-44 years
Infant mortality rate = Number of infant (< 1 year) deaths / Number of live births
Stillbirth rate = Number of intrauterine deaths after 28 weeks / Total births
Perinatal mortality rate = (Number of stillbirths + deaths in 1st week of life) / Total births

NB These rates are usually related to one year.
2. Comparing disease rates and measuring association

Several questions need to be answered. Is the incidence of the disease high? Does the incidence result from the suspected cause? Does the incidence change after an improvement or intervention? A few terms need to be understood first.

a) Exposed/unexposed population
A population exposed to a risk factor for a disease is at risk of developing that disease, and the reverse holds for an unexposed population. But how far is this true? It must be demonstrated by statistical calculation.
Example: workers at a coal plant have a high risk of lung cancer because they are directly exposed to coal. People who do not work at a coal plant do not carry this risk because they are not exposed. Even so, not everyone who is exposed will develop lung cancer, and not everyone who is unexposed will escape it.

b) Attributable risk
Attributable risk is the disease rate in exposed persons minus the rate in unexposed persons. It measures how much a person's risk of disease increases when they are exposed to the risk factor. For example, in deciding whether to take up a strenuous activity such as hill climbing, it is the attributable risk of the activity that must be weighed against the enjoyment of the sport.
c) Relative risk
Relative risk is the ratio of the disease rate in exposed persons to that in people who are unexposed. It is related to attributable risk by the formula:
Attributable risk = rate of disease in unexposed persons × (relative risk − 1)
Relative risk is less relevant to making decisions in risk management than is attributable risk. For example, given a choice between a doubling in their risk of death from bronchial carcinoma and a doubling in their risk of death from oral cancer, most informed people would opt for the latter. The relative risk is the same (two), but the corresponding attributable risk is lower because oral cancer is a rarer disease. Nevertheless, relative risk is the measure of association most often used by epidemiologists. One reason for this is that it can be estimated by a wider range of study designs. In particular, relative risk can be estimated from case-control studies, whereas attributable risk cannot. Another reason is the empirical observation that where two risk factors for a disease act in concert, their relative risks often come close to multiplying. Closely related to relative risk is the odds ratio, defined as:
$$\text{Odds ratio} = \frac{\text{odds of disease in exposed persons}}{\text{odds of disease in unexposed persons}}$$
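The three measures of association can be computed from the same four counts. The sketch below uses hypothetical cohort counts and a hypothetical function name; it also checks the stated relation between attributable risk and relative risk.

```python
def association_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Attributable risk, relative risk and odds ratio from cohort counts."""
    rate_exposed = exposed_cases / exposed_total
    rate_unexposed = unexposed_cases / unexposed_total
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return {
        "rate_unexposed": rate_unexposed,
        "attributable_risk": rate_exposed - rate_unexposed,
        "relative_risk": rate_exposed / rate_unexposed,
        "odds_ratio": odds_exposed / odds_unexposed,
    }

# Hypothetical cohort: 30 cases among 1,000 exposed, 10 cases among 1,000 unexposed.
m = association_measures(30, 1000, 10, 1000)
print(f"attributable risk = {m['attributable_risk']:.3f}")
print(f"relative risk     = {m['relative_risk']:.2f}")
print(f"odds ratio        = {m['odds_ratio']:.2f}")

# Relation from the text: AR = rate in unexposed x (RR - 1).
check = m["rate_unexposed"] * (m["relative_risk"] - 1)
```

When the disease is rare, as in this hypothetical cohort, the odds ratio lies close to the relative risk.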
Confounding
In an ideal laboratory experiment the investigator alters only one variable at a time, so that any effect he observes can only be due to that variable. Most epidemiological studies are observational, not experimental, and compare people who differ in all kinds of ways, known and unknown. If such differences determine risk of disease independently of the exposure under investigation, they are said to confound its association with the disease. For example, several studies have indicated high rates of lung cancer in cooks. Though this could be a consequence of their work (perhaps caused by carcinogens in fumes from frying), it may be simply because professional cooks smoke more than the average. In other words, smoking might confound the association with cooking.
Confounding determines the extent to which observed associations are causal. It may give rise to spurious associations when in fact there is no causal relation, or at the other extreme, it may obscure the effects of a true cause. Two common confounding factors are age and sex. Crude mortality from all causes in males over a five year period was higher in Bournemouth than in Southampton. However, this difference disappeared when death rates were compared for specific age groups (Table 3.2). It occurred not because Bournemouth is a less healthy place than Southampton but because, being a town to which people retire, it has a more elderly population.
Table 3.2 Deaths in males in Bournemouth and Southampton during a five year period

                        Bournemouth                          Southampton
Age group    No of                 Annual death    No of                 Annual death
(years)      deaths   Population   rate per        deaths   Population   rate per
                                   100 000                               100 000
<1              116          919      2 524           223        1 897      2 351
1-44            204       34 616        118           332       64 090        104
45-64         1 252       19 379      1 292         1 728       24 440      1 414
65+           4 076       11 760      6 932         3 639        9 120      7 980
All ages      5 648       66 674      1 694         5 922       99 547      1 190
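The rates in Table 3.2 can be recomputed from the deaths and populations. The sketch below assumes, as the table title states, that the deaths were counted over five years; the variable and function names are illustrative.

```python
YEARS = 5  # deaths in Table 3.2 were counted over a five year period

# (deaths, population) by age group, copied from Table 3.2
bournemouth = {"<1": (116, 919), "1-44": (204, 34616), "45-64": (1252, 19379), "65+": (4076, 11760)}
southampton = {"<1": (223, 1897), "1-44": (332, 64090), "45-64": (1728, 24440), "65+": (3639, 9120)}

def annual_rate(deaths, population, years=YEARS, per=100_000):
    """Annual death rate per 100 000."""
    return deaths / population / years * per

def crude_rate(town):
    """All-ages rate: total deaths over total population."""
    deaths = sum(d for d, _ in town.values())
    population = sum(p for _, p in town.values())
    return annual_rate(deaths, population)

print(f"crude: Bournemouth {crude_rate(bournemouth):.0f}, Southampton {crude_rate(southampton):.0f}")
for age in bournemouth:
    b = annual_rate(*bournemouth[age])
    s = annual_rate(*southampton[age])
    print(f"{age:>5}: Bournemouth {b:.0f}, Southampton {s:.0f}")

# Share of the male population aged 65+ in each town
share_b = bournemouth["65+"][1] / sum(p for _, p in bournemouth.values())
share_s = southampton["65+"][1] / sum(p for _, p in southampton.values())
```

The crude rate is higher in Bournemouth even though its rate in the 65+ group is lower, because a larger share of its population falls in that high-mortality group.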
1. Standardisation
The above example shows the dangers of drawing aetiological conclusions from comparisons of crude rates. The problem can be overcome by comparing age and sex specific rates as in Table 3.2, but the presentation of such data is rather cumbersome, and it is often helpful to derive a single statistic that summarises the comparison while allowing for differences in the age and sex structure of the
populations under study. Standardised or adjusted rates provide for this need. Two techniques are available:
a) Direct standardisation
Direct standardisation entails comparison of weighted averages of age and sex specific disease rates, the weights being equal to the proportion of people in each age and sex group in a convenient reference population. Table 3.3 shows the method of calculation, based on mortality from coronary heart disease in men in the USA aged 35-64 during 1968. Table 3.4 gives standardised rates for men and women in the ensuing years, calculated in the same way, and shows a remarkable fall.
Table 3.3 Example of direct standardisation, based on mortality from coronary heart disease (CHD) in men in the USA aged 35-64, 1968

Age (years)   CHD deaths/100 000 (1)   % of reference population   (1) x (2)
                                       in age group (2)
35-44                  93                      34.4                  3 199.2
45-54                 355                      36.0                 12 780.0
55-64                 961                      29.5                 28 349.5
Total                                         100                   44 328.7

Age standardised rate = 44 328.7/100 = 443 deaths/100 000
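The calculation in Table 3.3 is a weighted average, which can be sketched as follows. The figures are those of the table; the variable names are illustrative, and the division by 100 follows the table even though the printed percentages sum to 99.9.

```python
# Age specific CHD death rates per 100 000 (column 1 of Table 3.3)
rates = {"35-44": 93, "45-54": 355, "55-64": 961}

# Percentage of the reference population in each age group (column 2)
weights = {"35-44": 34.4, "45-54": 36.0, "55-64": 29.5}

# Weighted sum of (1) x (2), then divide by 100 because the weights are percentages
weighted_sum = sum(rates[age] * weights[age] for age in rates)
standardised_rate = weighted_sum / 100

print(f"sum of (1) x (2) = {weighted_sum:.1f}")
print(f"age standardised rate = {standardised_rate:.0f} deaths/100 000")
```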
Table 3.4 Coronary heart disease in American men and women aged 35-64: changes in age standardised mortality (deaths/100 000/year) during 1968-1974

         1968   1969   1970   1971   1972   1973   1974
Men       443    430    420    413    408    399    377
Women     134    126    126    124    120    118    111
b) Indirect standardisation
The direct method is suited to large studies; in most surveys the indirect method yields more stable risk estimates. Suppose that a general practitioner wants to test his impression of a local excess of chronic bronchitis. Using a standard questionnaire, he examines a sample of middle aged men from his list, and finds that 45 have persistent cough and phlegm. Is this excessive? The calculation is shown in Table 3.5.
Table 3.5 Example of indirect standardisation

Age (years)   No in study (1)   Symptom prevalence in    Expected cases
                                reference group (2)      = (1) x (2)
35-44               150                 8%                    12
45-54               100                 9%                     9
55-64                90                10%                     9
Total                                                         30

First the numbers of subjects in each age class are listed (column 1). The doctor must then choose a suitable reference population in which the class specific rates are known (column 2). (In mortality studies this would usually be the nation or some subset of it, such as a particular region or social class; in multicentre studies it could be the pooled data from all centres.) Cross multiplying columns 1 and 2 for each class gives the expected number of cases in a group of that age and size, based on the reference population's rates. Summation over all classes yields the total expected frequency, given the size and age structure of that particular study sample. Where 30 cases were expected he has observed 45, giving an age adjusted relative risk or standardised prevalence ratio of 45/30 = 150%. (Conventionally, standardised ratios are often expressed as percentages.) A comparable statistic, the standardised mortality ratio (SMR), is widely used by the registrar general in summarising time trends and regional and occupational differences. Thus in 1981 the standardised mortality ratio for death by suicide in male doctors was 172%, indicating a large excess relative to the general population at the time. To analyse time trends, as with the cost of living index, an arbitrary base year is taken.
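The indirect method described above reduces to an observed count divided by an expected count. A minimal sketch, using the figures of Table 3.5 and illustrative variable names:

```python
# Number of men examined in each age class (column 1 of Table 3.5)
study_counts = {"35-44": 150, "45-54": 100, "55-64": 90}

# Symptom prevalence in the reference group (column 2), as proportions
reference_prevalence = {"35-44": 0.08, "45-54": 0.09, "55-64": 0.10}

# Expected cases if the study sample had the reference group's age specific rates
expected = sum(study_counts[age] * reference_prevalence[age] for age in study_counts)

observed = 45  # men found to have persistent cough and phlegm
standardised_ratio = observed / expected * 100

print(f"expected cases = {expected:.0f}")
print(f"standardised prevalence ratio = {standardised_ratio:.0f}%")
```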
c) Other methods of adjusting for confounders
The techniques of standardisation are usually used to adjust for age and sex, although they can be applied to control for other confounders. Other methods, which are used more generally to adjust for confounding, include mathematical modelling techniques such as logistic regression. These assume that a person's risk of disease is a specified mathematical function of his exposure to different risk factors and confounders. For example, it might be assumed that his odds of developing lung cancer are a product of a constant and three parameters - one determined by his age, one by whether he smokes, and the third by whether he has worked with asbestos. A computer program is then used to calculate the values of the parameters that best fit the observed data. These parameters estimate the odds ratios for each risk factor - age, smoking, and exposure to asbestos - and are mutually adjusted. Such modelling techniques are powerful and readily available to users of personal computers. They should be used with caution, however, as the mathematical assumptions in the model may not always reflect the realities of biology.
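The multiplicative form of such a model can be illustrated without fitting anything. In the sketch below every number is hypothetical, age is simplified to a yes/no "older age" indicator, and the function names are invented for the example; a real analysis would estimate the parameters from data with a statistics package.

```python
def modelled_odds(baseline_odds, odds_ratios, exposures):
    """Multiply the baseline odds by the odds ratio of every factor that is present."""
    odds = baseline_odds
    for factor, present in exposures.items():
        if present:
            odds *= odds_ratios[factor]
    return odds

def odds_to_probability(odds):
    """Convert odds to a probability (risk)."""
    return odds / (1.0 + odds)

# Hypothetical parameter values, for illustration only.
BASELINE_ODDS = 0.001  # the "constant": odds with no risk factor present
ODDS_RATIOS = {"older_age": 3.0, "smoker": 10.0, "asbestos": 5.0}

person = {"older_age": True, "smoker": True, "asbestos": False}
odds = modelled_odds(BASELINE_ODDS, ODDS_RATIOS, person)
risk = odds_to_probability(odds)
print(f"modelled odds = {odds:.3f}, modelled risk = {risk:.3f}")
```

Taking logarithms turns the product into a sum, which is the linear form that logistic regression programs fit.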
Measurement error and bias

Epidemiological studies measure characteristics of populations. The parameter of interest may be a disease rate, the prevalence of an exposure, or more often some measure of the association between an exposure and disease. Because studies are carried out on people and have all the attendant practical and ethical constraints, they are almost invariably subject to bias.

1. Selection bias
Selection bias occurs when the subjects studied are not representative of the target population about which conclusions are to be drawn. Suppose that an investigator wishes to estimate the prevalence of heavy alcohol consumption (more than 21 units a week) in adult residents of a city. He might try to do this by selecting a random sample from all the adults registered with local general practitioners, and sending them a postal questionnaire about their drinking habits. With this design, one source of error would be the exclusion from the study sample of those residents not registered with a doctor. These excluded subjects might have different patterns of drinking from those included in the study. Also, not all of the subjects selected for study will necessarily complete and return questionnaires, and non-responders may have different drinking habits from those who take the trouble to reply. Both of these deficiencies are potential sources of selection bias. The possibility of selection bias should always be considered when defining a study sample. Furthermore, when responses are incomplete, the scope for bias must be assessed. The problems of incomplete response to surveys are considered further elsewhere.

2. Information bias
The other major class of bias arises from errors in measuring exposure or disease. In a study to estimate the relative risk of congenital malformations associated with maternal exposure to organic solvents such as white spirit, mothers of malformed babies were questioned about their contact with such substances during pregnancy, and their answers were compared with those from control mothers with normal babies. With this design there was a danger that "case" mothers, who were highly motivated to find out why their babies had been born with an abnormality, might recall past exposure more completely than controls. If so, a bias would result with a tendency to exaggerate risk estimates. Another study looked at risk of hip osteoarthritis according to physical activity at work, cases being identified from records of admission to hospital for hip replacement. Here there was a possibility of bias because subjects with physically demanding jobs might be more handicapped by a given level of arthritis and therefore seek treatment more readily. Bias cannot usually be totally eliminated from epidemiological studies. The aim, therefore, must be to keep it to a minimum, to identify those biases that cannot be avoided, to assess their potential impact, and to take this into account when interpreting results. The motto of the epidemiologist could well be "dirty hands but a clean mind" (manus sordidae, mens pura).

3. Measurement error
As indicated above, errors in measuring exposure or disease can be an important source of bias in epidemiological studies. In conducting studies, therefore, it is important to assess the quality of measurements. An ideal survey technique is valid (that is, it measures accurately what it purports to measure). Sometimes a reliable standard is available against which the validity of a survey method can be assessed. For example, a sphygmomanometer's validity can be measured by comparing its readings with intraarterial pressures, and the validity of a mammographic diagnosis of breast cancer can be tested (if the woman agrees) by biopsy. More often, however, there is no sure reference standard. The validity of a questionnaire for diagnosing angina cannot be fully known: clinical opinion varies among experts, and even coronary arteriograms may be normal in true cases or abnormal in symptomless people. The pathologist can describe changes at necropsy, but these may say little about the patient's symptoms or functional state. Measurements of disease in life are often incapable of full validation.

In practice, therefore, validity may have to be assessed indirectly. Two approaches are used commonly. A technique that has been simplified and standardized to make it suitable for use in surveys may be compared with the best conventional clinical assessment. A self administered psychiatric questionnaire, for instance, may be compared with the majority opinion of a psychiatric panel. Alternatively, a measurement may be validated by its ability to predict future illness. Validation by predictive ability may, however, require the study of many subjects.
Analysing validity
When a survey technique or test is used to dichotomise subjects (for example, as cases or non-cases, exposed or not exposed) its validity is analysed by classifying subjects as positive or negative, firstly by the survey method and secondly according to the standard reference test. The findings can then be expressed in a contingency table as shown below.
Table 4.1 Comparison of a survey test with a reference test

Survey test   Reference test result
result        Positive                          Negative                          Totals
Positive      True positives, correctly         False positives = (b)             Total test positives = (a + b)
              identified = (a)
Negative      False negatives = (c)             True negatives, correctly         Total test negatives = (c + d)
                                                identified = (d)
Totals        Total true positives = (a + c)    Total true negatives = (b + d)    Grand total = (a + b + c + d)
From this table four important statistics can be derived:
Sensitivity - A sensitive test detects a high proportion of the true cases, and this quality is measured here by a/(a + c).
Specificity - A specific test has few false positives, and this quality is measured by d/(b + d).
Systematic error - For epidemiological rates it is particularly important for the test to give the right total count of cases. This is measured by the ratio of the total numbers positive to the survey and the reference tests, or (a + b)/(a + c).
Predictive value - This is the proportion of positive test results that are truly positive, a/(a + b). It is important in screening.
It should be noted that both systematic error and predictive value depend on the relative frequency of true positives and true negatives in the study sample (that is, on the prevalence of the disease or exposure that is being measured).
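The four statistics follow directly from the cells a, b, c and d of the contingency table. The sketch below uses hypothetical counts and an illustrative function name; the second call keeps sensitivity and specificity the same but lowers the prevalence, to show the dependence noted above.

```python
def validity_statistics(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "systematic_error": (a + b) / (a + c),
        "predictive_value": a / (a + b),
    }

# Hypothetical survey of 200 subjects, 50 of whom are true cases.
common = validity_statistics(a=45, b=10, c=5, d=140)

# Same sensitivity (90%) and specificity (about 93%), but only 50 true cases in 1,550 subjects.
rare = validity_statistics(a=45, b=100, c=5, d=1400)

for name, stats in (("common disease", common), ("rare disease", rare)):
    summary = ", ".join(f"{key}={value:.2f}" for key, value in stats.items())
    print(f"{name}: {summary}")
```

With the rarer disease the predictive value falls and the systematic error rises, although the test itself has not changed.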
1. Sensitive or specific? A matter of choice

If the criteria for a positive test result are stringent then there will be few false positives but the test will be insensitive. Conversely, if criteria are relaxed then there will be fewer false negatives but the test will be less specific. In a survey of
breast cancer, alternative diagnostic criteria were compared with the results of a reference test (biopsy). Clinical palpation by a doctor yielded fewest false positives (93% specificity), but missed half the cases (50% sensitivity). Criteria for diagnosing "a case" were then relaxed to include all the positive results identified by doctor's palpation, nurse's palpation, or X-ray mammography: few cases were then missed (94% sensitivity), but specificity fell to 86%. By choosing the right test and cut-off points it may be possible to get the balance of sensitivity and specificity that is best for a particular study. In a survey to establish prevalence this might be when false positives balance false negatives. In a study to compare rates in different populations the absolute rates are less important, the primary concern being to avoid systematic bias in the comparisons: a specific test may well be preferred, even at the price of some loss of sensitivity.
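The trade-off between stringent and relaxed criteria can be illustrated with a small Python sketch. It assumes a test that yields a numeric score and is called positive when the score reaches a chosen cut-off; the scores, the cut-offs, and the function name `sens_spec` are all hypothetical.

```python
def sens_spec(scores_cases, scores_noncases, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as a positive test."""
    true_positives = sum(score >= cutoff for score in scores_cases)
    true_negatives = sum(score < cutoff for score in scores_noncases)
    return true_positives / len(scores_cases), true_negatives / len(scores_noncases)


# Hypothetical test scores for subjects classified by the reference test
cases = [4, 5, 6, 7, 8, 9]
noncases = [1, 2, 3, 4, 5, 6]

strict = sens_spec(cases, noncases, cutoff=7)   # stringent criterion
relaxed = sens_spec(cases, noncases, cutoff=4)  # relaxed criterion
print("strict  (sensitivity, specificity):", strict)
print("relaxed (sensitivity, specificity):", relaxed)
```

With these invented scores the stringent cut-off produces no false positives but misses half the cases, while the relaxed cut-off misses none but is less specific.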
Types of Statistical Study and Study Design

Clinical statistical studies have their own terminology, which every researcher must understand. They fall into two broad categories [3]:
1. Experimental Studies:
The hallmark of the experimental study is that the allocation or assignment of individuals is under the control of the investigator and thus can be randomized. The key is that the investigator controls the assignment of the exposure or of the treatment, while symmetry of potential unknown confounders is maintained through randomization. Properly executed experimental studies provide the strongest empirical evidence. Randomization also provides a better foundation for statistical procedures than observational studies do.
a) Randomized Controlled Clinical Trial (RCT):

A prospective, analytical, experimental study using primary data generated in the clinical environment. Individuals similar at the beginning are randomly allocated to two or more treatment groups and the outcomes of the groups are compared after sufficient follow-up time. Properly executed, the RCT is the strongest evidence of the clinical efficacy of preventive and therapeutic procedures in the clinical setting.
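Random allocation to treatment groups can be sketched in a few lines of Python. This is a minimal illustration of simple randomization with balanced group sizes, not a full trial randomization procedure; the function name `randomize`, the group labels, and the seed are assumptions made for the example.

```python
import random


def randomize(subject_ids, groups=("treatment", "control"), seed=None):
    """Randomly allocate subjects to groups of (nearly) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects out in turn so group sizes differ by at most one
    return {sid: groups[i % len(groups)] for i, sid in enumerate(shuffled)}


# Hypothetical study with 20 subjects numbered 1 to 20
allocation = randomize(range(1, 21), seed=2012)
for sid in sorted(allocation):
    print(sid, allocation[sid])
```

Because chance alone decides the group, known and unknown confounders tend to be distributed symmetrically between the groups.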
b) Randomized Cross-Over Clinical Trial:

A prospective, analytical, experimental study using primary data generated in the clinical environment. Individuals with a chronic condition are randomly allocated to one of two treatment groups, and, after a sufficient treatment period and often a washout period, are switched to the other treatment for the same period. This design is susceptible to bias if carry over effects from the first treatment occur. An important variant is the "N of One" clinical trial in which alternative treatments for a chronically affected individual are administered in a random sequence and the individual is observed in a double blind fashion to determine which treatment is the best.
c) Randomized Controlled Laboratory Study:

A prospective, analytical, experimental study using primary data generated in the laboratory environment. Laboratory studies are very powerful tools for doing basic research because all extraneous factors other than those of interest can be controlled or accounted for (e.g., age, gender, genetics, nutrition, environment, co-morbidity, strain of infectious agent). However, this control of other factors is also the weakness of this type of study. Animals in the clinical environment have a wide range of all these controlled factors as well as others that are unknown. If any interactions occur between these factors and the outcome of interest, which is usually the case, the laboratory results are not directly applicable to the clinical setting unless the impact of these interactions is also investigated.
2. Observational Studies:
The allocation or assignment of factors is not under the control of the investigator. In an observational study, the combinations are self-selected or are "experiments of nature". For those questions where it would be unethical to assign factors, investigators are limited to observational studies. Observational studies provide weaker empirical evidence than do experimental studies because of the potential for large confounding biases to be present when there is an unknown association between a factor and an outcome. The symmetry of unknown confounders cannot be maintained. The greatest value of these types of studies (e.g., case series, ecologic, case-control, cohort) is that they provide preliminary evidence that can be used as the basis for hypotheses in stronger experimental studies, such as randomized controlled trials.
a) Cohort (Incidence, Longitudinal) Study:

A prospective, analytical, observational study, based on data, usually primary, from a follow-up period of a group in which some have had, have or will have the exposure of interest, to determine the association between that exposure and an outcome. Cohort studies are susceptible to bias by differential loss to follow-up, the lack of control over risk assignment and thus confounder symmetry, and the potential for zero time bias when the cohort is assembled. Because of their prospective nature, cohort studies are stronger than case-control studies when well executed but they also are more expensive. Because of their observational nature, cohort studies do not provide empirical evidence that is as strong as that provided by properly executed randomized controlled clinical trials.
b) Case-Control Study:
A retrospective, analytical, observational study, often based on secondary data, in which the proportion of cases with a potential risk factor is compared to the proportion of controls (individuals without the disease) with the same risk factor. The common association measure for a case-control study is the odds ratio. These studies are commonly used for initial, inexpensive evaluation of risk factors and are particularly useful for rare conditions or for risk factors with long induction periods. Unfortunately, due to the potential for many forms of bias in this study type, case-control studies provide relatively weak empirical evidence even when properly executed.
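The odds ratio mentioned above is the odds of exposure among cases divided by the odds of exposure among controls. A minimal Python sketch follows; the function name `odds_ratio` and the counts are hypothetical and chosen only to show the calculation.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical case-control data: 100 cases and 100 controls
or_estimate = odds_ratio(
    exposed_cases=30, unexposed_cases=70,
    exposed_controls=15, unexposed_controls=85,
)
print(f"odds ratio = {or_estimate:.2f}")
```

An odds ratio above 1 suggests that the risk factor is more common among cases than among controls; it does not by itself exclude bias or confounding.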
c) Ecologic (Aggregate) Study:

An observational analytical study based on aggregated secondary data. Aggregate data on risk factors and disease prevalence from different population groups is compared to identify associations. Because all data are aggregate at the group level, relationships at the individual level cannot be empirically determined but are rather inferred from the group level. Thus, because of the likelihood of an ecologic fallacy, this type of study provides weak empirical evidence.
d) Cross-Sectional (Prevalence) Study:

A descriptive study of the relationship between diseases and other factors at one point in time (usually) in a defined population. Cross sectional studies lack any information on timing of exposure and outcome relationships and include only prevalent cases.
e) Case Series:
A descriptive, observational study of a series of cases, typically describing the manifestations, clinical course, and prognosis of a condition. A case series provides weak empirical evidence because of the lack of comparability unless the findings are dramatically different from expectations. Case series are best used as a source of hypotheses for investigation by stronger study designs, leading some to suggest that the case series should be regarded as clinicians talking to researchers. Unfortunately, the case series is the most common study type in the clinical literature.
f) Case Report:
Anecdotal evidence. A description of a single case, typically describing the manifestations, clinical course, and prognosis of that case. Due to the wide range of natural biologic variability in these aspects, a single case report provides little empirical evidence to the clinician. They do describe how others diagnosed and treated the condition and what the clinical outcome was.
CHAPTER III. Conducting a Scientific Study

Research Objective
1. Definition
A research objective is a statement that describes what is to be achieved (it tells what you want the study to do for you). It determines your options for the subsequent steps in the study design.
A research objective is important because it helps to:
o facilitate the development of the research methodology and other components of the research process
o focus the study
o avoid unnecessary data collection
2. Classification of research objectives

General and Specific
a) General Objective
States what is expected to be achieved by the study in general terms. General objectives usually include problem verification and analysis of the causes of the problem.
Example: The study aims at assessing the relative importance of patients' attitudes, the knowledge of health personnel about early symptoms, and the availability of diagnostic facilities as causes of delay in diagnosing pre-eclampsia.
b) Specific Objectives
The breakdown of the general objective into smaller, logically connected parts that systematically address the various aspects of the problem. They also cover quantifying the distribution of the problem, identifying possible contributory factors, and stating what is expected at the end of the study (WHAT THE STUDY HOPES TO ACCOMPLISH). Specific objectives have to be concrete and specific, and should use action verbs.
Example
1. The study aims at establishing the frequency of cancellation of elective operations.
2. The study aims at comparing the use of ultrasound in prenatal diagnostics by junior and senior physicians.
3. The study aims at computing the correlation between the patient's age and the length of stay.
Formulation of objectives
Refer to the problem analysis diagram (loosely termed a bubble chart), which covers the different aspects of the problem. State the objectives in a logical sequence, clearly phrased in operational terms (WHAT, WHERE and WHY). Be realistic and not over-ambitious. Objectives must also be measurable and use action verbs (to determine, compare, describe, establish, calculate).
Study Hypothesis
A statement that predicts the relationship between one or more factors and the problem being studied. The study then tests this prediction.
How to search for materials

Finding reference sources for research is very easy nowadays. We no longer need to go to a large library to find the material we want. All the information is at our fingertips; the only question is whether or not we know how to search. I divide the ways of searching for reference material into several categories.
1. Books or magazines

These can be found in public libraries or, more specifically, in medical libraries. These reference sources are obtained as hard copy.
2. Journals
Likewise, medical journals can be obtained from libraries that subscribe to them. Some libraries do not subscribe to many journals, which limits our search for reference material. But do not worry: most well-known journals can now be reached through their websites. Some provide material free of charge and others require a subscription.
3. Internet
Searching for reference material on the internet has changed the lives of researchers. Previously they depended on large libraries or on the medical libraries of medical universities. Now, sitting at a computer with an internet connection, a researcher can find reference material in the blink of an eye. A brief outline of how to search for reference material on the internet follows.
a) Websites commonly visited for medical reference material.
Name                               Site
American Family Physician          http://www.aafp.org/
Archive of Family Medicine         http://www.racgp.org.au/
Australian Family Physician        http://www.racgp.org.au/publications/afp_online.asp
British Medical Journal            http://bmj.bmjjournals.com/
Canadian Medical Journal           http://www.cmaj.ca/
Centre of EBM                      http://www.cebm.net/
CPG Singapore                      http://moh.gov.sg/pub/cpg/cpg.htm
McMaster EBM                       http://www.cche.net/principles/content_all.asp
PubMed Central                     http://www.pubmedcentral.nih.gov/
CPG New Zealand                    http://www.nzgg.org.nz/library.cfm
Global Family Doctor               http://www.globalfamilydoctor.com/
Malaysian Medical Association      http://www.mma.org.my/
PCDOM                              http://www2.jaring.my/pcdom/
Primary Care Clinical Guidelines   http://medicine.ucsf.edu/resources/guidelines/indexalpha.html/
SIGN                               http://www.sign.ac.uk/
Postgraduate Medicine              http://www.postgradmed.com/
Mayo Clinic                        http://www.mayoclinic.com/index.cfm/
AcadMed Malaysia                   http://www.acadmed.org.my/
Malaysian Medical Resources        http://mimed.cjb.net/
WHO                                http://www.who.int/home-page/
Bandolier                          http://www.jr2.ox.ac.uk/bandolier/
National Guideline Clearinghouse   http://www.guideline.gov/index.asp
Free Medical Journals              http://www.freemedicaljournals.com/
Evidence-based Nursing             http://ebn.bmjjournals.com/contents-by-date.0.shtml
Journal of Community Nursing       http://www.jcn.co.uk/journal.asp?showArt=no
b) Other very useful websites.

Name                                 Site                               Resource
Google                               http://www.google.com/             Good internet search engine with better safety features (Google also offers a desktop search of all material on the PC)
Gmail                                http://gmail.google.com/           Free, search-based webmail service with large storage: 1,000 megabytes (1 gigabyte)
Picasa                               http://www.picasa.com/             Software that helps you instantly find, edit and share all the pictures on your PC. A free software download from Google
Google Scholar                       http://scholar.google.com/         Good search engine for scientific (academic) papers & results
Medscape (also for MEDLINE search)   http://www.medscape.com/           Free resource for physicians, with customised CME, medical journal articles, MEDLINE search, medical news, etc
Biomail                              http://www.biomail.org/            Customised new references from MEDLINE to your email account
Cochrane Collaboration               http://www.cochrane.org            Up-to-date, accurate information about effects of healthcare. Systematic reviews of healthcare interventions.
National Library of Medicine         http://www.nlm.nih.gov/database/
UK Evidence-Based Medicine Centre    http://cebm.jr2.ox.ac.uk/
Canadian Evidence-based medicine     http://www.cche.net/
Malaysian Medical Resources          http://medicine.com.my/2005_01 JD1 mmrarchive.html and http://palmdoc.blogspot.com/   Web page run by colleagues to support colleagues & clients (Alan Teh aka Palmdoc, TE Cheah); also connects with Palmdoc & gives ideas on how to use a Palm
4. Compact Disc
Available when you buy reference books such as the Harrison medical textbook.
5. Medline
Searching through MEDLINE is the most successful search method to date, because MEDLINE can search for specific medical terms that match what the user wants. You can run a MEDLINE search free of charge through the PubMed Central website of the National Center for Biotechnology Information. It contains free journal material: more than 300,000 items from 150 online journals. Most importantly, it is free to the public. Visit this website to begin your search. (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PMC)
[Figure: layout of the search page, with an example of a MEDLINE search]

6. Ovid and Proquest

Ovid and Proquest are also reputable online sites. However, searching through Ovid or Proquest is not free; you must subscribe. Large institutions such as universities usually subscribe to the Ovid and Proquest services.
a) Ovid: http://gateway.ovid.com/
b) Proquest: http://proquest.umi.com/login
How to do a study
Epidemiological surveys use various study designs and range widely in size. At one extreme a case-control investigation may include fewer than 50 subjects, while at the other, some large longitudinal studies follow up many thousands of people for several decades. The main study designs will be described in later chapters, but we here discuss important features that are common to the planning and execution of surveys, whatever their specific design.
1. Early planning

The success of data collection requires careful preparation. The first and often the most difficult question is "Why am I doing this survey?" Many studies start with a general hope that something interesting will emerge, and they often end in frustration. The general interest has first to be translated into precisely formulated, written objectives. Every survey should be reasonably sure to give an adequate answer to at least one specific question. This initial planning requires some idea of the final analysis; and it may be useful at the outset to outline the key tables for the final report, and to consider the numbers of cases expected in their major cells. Every study needs a primary purpose. It is easy to argue "While we have the subjects there, let's also measure..."; but overloading, whether of investigators or subjects, must be avoided if it in any way threatens the primary purpose. Sometimes subsidiary objectives may be pursued in subsamples (every nth subject, or in a particular age group) or by recalling some subjects for a second examination: when their initial contact has been favourable then response to recall is usually good.
2. Background reading

Before planning the detail of a study, it is wise to carry out a library search of the relevant background publications. Occasionally this may show the answer to the study question without any need for further data collection; or it may uncover useful sources of published information, such as the registrar general's mortality and cancer registry reports, which can form the basis of an analysis without the requirement for an expensive and time consuming field survey. Even when survey work remains necessary, experience in earlier related investigations may guide the design or indicate pitfalls to be avoided.
3. Choice of examination methods

The overriding need in an epidemiological survey is to examine a representative sample of adequate size in a standardised and sufficiently valid way. This determines the choice of examination methods and the points where these differ from those of clinical practice. Methods must be acceptable, and if possible noninvasive, or else cooperation suffers and the study group becomes unrepresentative. They must be relatively cheap and quick, or not enough subjects can be examined: with fixed resources the need for detail conflicts with the need for numbers. Most important of all, methods and observers must be capable of rigorous standardisation; even if this excludes the benefits of clinical judgement.
4. Information abstracted from existing records

Sometimes adequately standardised information is already available from existing records. For example, in a study to examine the long term incidence of hypothyroidism after treatment with radioiodine for thyrotoxicosis, it was possible to identify treated patients and obtain the information needed to follow them up (name, date of birth, sex, address, etc) by searching hospital files. When existing records are exploited in this way, the required information is normally abstracted on to a specially designed form or even direct on to a portable computer. The design of the abstraction form or of the computer program for inputting data should take into account the layout of the source material. Having to flick repeatedly backwards and forwards through the source record is not only tedious and time consuming, but may also increase the chance of error. Each abstracted record should be identified by a serial number, and should include sufficient information to permit easy access back to the source material for checking and to obtain additional data if required. When data are not abstracted direct on to computer, later transfer to computer will often be facilitated by numerical coding, in which case coding boxes can be provided on the right hand side of the abstraction form. Some items of data (for example, dates of birth) can easily be
written direct into the coding boxes. Others, such as occupation, may need to be recorded in words and coded later as a separate exercise. Time spent writing is minimised if non-numerical information is, when possible, ringed or ticked rather than having to be written out. To minimise the chance of error, any reformulation of numerical data (for example, derivation of age at hospital admission from date of birth and date of admission) should be carried out by the computer after data entry, and not as part of the abstraction process. When coding data, allowance must be made for the possibility of missing information.
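Deriving age at admission by computer, rather than by hand during abstraction, might look like the following Python sketch. The function name `age_at_admission` and the use of `None` to represent a missing date are assumptions made for this illustration.

```python
from datetime import date


def age_at_admission(date_of_birth, date_of_admission):
    """Age in completed years at admission, or None if either date is missing."""
    if date_of_birth is None or date_of_admission is None:
        return None  # keep missing information explicit instead of guessing
    years = date_of_admission.year - date_of_birth.year
    # Subtract one year if the birthday had not yet been reached at admission
    if (date_of_admission.month, date_of_admission.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical abstracted records
print(age_at_admission(date(1950, 6, 15), date(2008, 6, 14)))
print(age_at_admission(date(1950, 6, 15), date(2008, 6, 15)))
print(age_at_admission(None, date(2008, 6, 15)))
```

Only the two source dates are abstracted and entered; the derived value is computed the same way for every record, and a missing date yields a missing age rather than an error.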
5. Questionnaires
Epidemiological data are often obtained by means of questionnaires. These may be either self administered (that is, completed by the subject) or administered at interview. Self administered questionnaires are easier to standardise because the possibility of systematic differences in interviewing technique is avoided. On the other hand, they are limited by the need to be unambiguously understood by all subjects. An interviewer may be essential to collect information on complex topics. Good design of questionnaires requires skill. The language used should be clear and simple. Two short questions, each covering one point, are better than one longer question which covers two points at once. A question that has been used successfully in a previous study has obvious advantages. The order of questions should take into account the sensitivities of the person to whom they are addressed - it is better to start with "What is your date of birth?" than launch straight into "Have you ever been treated for gonorrhoea?" - and should be designed to facilitate recall. For example, all questions relating to one phase of the person's life might be grouped together. As a check on the reliability of information, it may sometimes be helpful to include overlapping questions. In a study of risk factors for back pain, some people reported that their jobs entailed driving for more than four hours a day but did not involve more than two hours sitting. This suggests that they had not properly understood the questions. An
important consideration is whether to use closed or open ended questions. Closed ended questions, with one box for each possible answer (including "don't know") are more readily answered and classified, but cannot always collect information in the detail that is required. When interviewers are used then the wording with which they ask questions should be standardised as far as is compatible with the need to obtain useful information. As in abstracting existing records, the forms used to record answers to questions should be designed for ease and accuracy of completion and to simplify subsequent coding and analysis.
6. Physical examination and clinical investigations

Methods of physical examination should be designed to reduce variation within and between observers. Often, a quantitative measurement (for example, respiratory rate) is easier to standardise than a qualitative judgement (whether someone is tachypnoeic or not). Standardisation of laboratory assays can be improved by careful specification of the method by which specimens should be collected and stored and by rigorous quality control of the analysis. Whatever method of data collection is adopted, it is usually worth trying it out in a pilot survey before embarking on the main study. Identification of practical snags at this stage can save much difficulty later. In large studies the questionnaire or record design should be discussed with the statistician who will later be concerned in the analysis.
7. Staff and training

In a small study the doctor himself may do all the work, but in large surveys he will need helpers. If an epidemiological examination technique requires skill and clinical judgement it has probably been insufficiently standardised: if it is adequately standardised it can usually be taught to any intelligent person. The figure shows how two observers had distinct but opposite time trends in their performances during the early stages of a survey of skinfold thickness. Such training effects, which are common, should have been completed before the start
of the main study: new staff need supervised practice under realistic field conditions followed by pre-survey testing.
Despite all precautions, observer differences may persist. Observers should therefore be allocated to subjects in a more or less random way: if, for example, one person examined most of the men, and another most of the women, then observer differences would be confounded with true sex differences. To maintain quality control throughout the survey each examiner's identity should be entered on the record, and results for different examiners may then be compared.
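Once each examiner's identity is entered on the record, a first comparison between examiners can be as simple as a mean per examiner. The Python sketch below assumes each record is an (examiner, measurement) pair; the examiner codes, the values, and the function name `mean_by_examiner` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_by_examiner(records):
    """Group measurements by examiner identity and return the mean for each examiner."""
    grouped = defaultdict(list)
    for examiner, value in records:
        grouped[examiner].append(value)
    return {examiner: mean(values) for examiner, values in grouped.items()}


# Hypothetical skinfold measurements (mm) recorded with the examiner's code
records = [("A", 10.0), ("A", 12.0), ("B", 14.0), ("B", 16.0)]
examiner_means = mean_by_examiner(records)
print(examiner_means)
```

Such a comparison is only interpretable if examiners were allocated to subjects more or less at random; otherwise a difference between examiners may simply reflect a difference between the subjects they saw.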
How to Choose a Study Topic

How do we set up a study and choose its topic? I learned the approach shown by Prof Khoo, which uses the PICO technique.
P  Patient population characteristics
I  Intervention approach
C  Comparison maneuver
O  Outcome
After framing the question with PICO:
o Track down relevant articles
o Critically appraise them
o Determine the applicability of the results
Example 1: Your clinic had a high burden of asthma among children last year. How can the incidence of cases be reduced this year? Using the PICO method, this can be turned into a study topic as follows:
P = Children with asthma attending Klinik Kesihatan Permaisuri.
I = Health education.
C = Control patients who do not receive health education.
O = Reduce the incidence of asthma this year.
o Search for material on asthma, on health education methods, and on their effectiveness.
o Find an expert adviser with whom the study proposal can be discussed and critiqued.
o Make sure the planned study suits current circumstances.
Resulting topic: Case control study about Asthma Education among children attending Permaisuri Health Center from June to December 2008.
Example 2: The number of neonatal jaundice (NNJ) cases in Setiu District was observed to rise sharply last year. What would be a suitable study topic? Our brief investigation suggests that the increase resulted from late notification of cases, but whether this is really so is not yet certain. We also want to know which other factors may be involved. Using the mnemonic given above, we can formulate the topic of the study:
P = Neonates born in Setiu District.
I = None.
C = All births during the study period.
O = Identify the actual factors causing NNJ.
Possible topic: Cross Sectional study on High Incidence of Neonatal Jaundice in District Setiu from June to December 2008.
Data Collection
1. Types of data
a) Primary data
Data obtained from the original source, that is, measured directly from the source population. They are obtained by direct examination or measurement, by interview, or from health records.
b) Secondary data
Data obtained from a second-hand source in which they have already been processed, such as annual reports.
2. Types of variable
Dependent and independent variables
In most cause-and-effect studies the researcher is looking at the relationship between independent variables and dependent variables:
o the effect or outcome is the dependent variable
o the cause is an independent variable
Example: In a survey to investigate whether there is a relationship between mothers smoking cigarettes and the weight of newborns, the dependent variable is the newborn's weight and the independent variable is the mother's smoking habit.
If a researcher looks for a causal explanation, the characteristic of the problem under study may be called the DEPENDENT VARIABLE or OUTCOME. The INDEPENDENT VARIABLES are the characteristics or factors that are assumed to cause or influence the problem.
The variables to be studied are selected on the basis of their relevance to the objectives of the study. A variable may be independent or dependent according to the objectives of the study. The decision on which variables are IV or DV follows from the statement of the problem.
Example 1: If a researcher investigates whether smoking causes lung cancer, then lung cancer is the dependent variable. But if he investigates why people smoke, then smoking would be the dependent variable.
Example 2: If a researcher wants to investigate whether the poor quality of hospital food leads patients not to take the hospital diet, then the number of patients not taking the food is the dependent variable. But if he wants to investigate why the hospital food is of poor quality, then the quality of the food would be the dependent variable.
The problem to be studied is measured in terms of dependent variables. The factors influencing the problem are measured in terms of independent variables.
Confounding variable
A variable that is associated with the problem and with the possible cause of the problem. This variable may either strengthen or weaken the apparent relationship between an outcome and a possible cause.
[Diagram: Cause (independent) leads to Effect/outcome (dependent); Other factors (confounding) are related to both]
Example: A relationship is shown between the low level of the mother's education (IV) and malnutrition (DV) in under-fives. However, family income is related to the mother's education as well as to malnutrition.
3. Measurement of variables
a) Qualitative
Categorised according to the property or characteristic that distinguishes them, such as ethnicity: Malay, Chinese, Indian, etc. Qualitative variables can be further divided into nominal and ordinal types.
Nominal - The categories have no particular order or ranking, such as ethnicity (Malay, Chinese, Indian and others).
Ordinal - The categories have a particular order, but the distance between categories is not known, such as job rank.
Also known as categorical data.
b) Quantitative
The observations are numeric and are obtained by measuring, gauging or counting. There are two types.
Discrete - The result of counting, in whole numbers, e.g. number of children or wives.
Continuous - Can take fractional values and results from measurement, such as blood pressure or haemoglobin level.
Also known as numerical data.
4. Defining the variables

Once the variables have been selected each of them should be clarified Two aspects need to be considered:
a) Conceptual definition
To define the variable as it is conceived. Example: obesity is defined as excessive fatness, overweight, etc.
b) Operational definition
(Working definition) The characteristics the investigator will actually measure.
Example: Obesity is defined in terms of a weight obtained by weighing in underclothes and without shoes. Other examples are given in Appendix 1.
An operational definition of a variable forces the investigator to consider practicability. Appendix 2 gives a number of questions which may arise when attempting to define variables.
Sampling Method
1. Sample size
Most surveys and trials are smaller than the investigator would wish, lack of numbers often setting a limit to some desirable subgroup analysis. This is inevitable. What can be avoided is discovering only at the final analysis that numbers do not permit achievement even of the study's primary objective. To prevent this disappointment the purpose of the study has first to be formulated in precise statistical terms. If the aim is to estimate prevalence, then sample size will depend on the required accuracy of that estimate. (Table 5.1 gives some examples.) Sampling error is proportionally greater for less common conditions; that is to say, to achieve the same level of confidence requires a larger sample if prevalence is low.

Table 5.1 95% confidence limits for various rates and sample sizes

Estimated prevalence (%)   95% confidence limits
                           n=500        n=1000
2                          1.0-3.7      1.2-3.1
10                         7.5-13.0     8.2-12.0
20                         16.6-23.8    17.6-22.6
Techniques also exist for calculating sample sizes required for estimating, with specified precision, the mean value of a variable, or for identifying a given difference in prevalence or mean values between two populations. These techniques may be found in textbooks or (better) by consulting a statistician; but either way the investigators must first know exactly what they want to achieve.
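As an illustration of how confidence limits like those in Table 5.1 narrow as the sample grows, the following is a minimal sketch. It uses the Wilson score interval, which is an assumption: the booklet does not say which method produced Table 5.1, so the printed limits may differ slightly from the table. The function name wilson_interval is hypothetical.

```python
import math


def wilson_interval(prevalence, n, z=1.96):
    """Approximate 95% confidence limits for a proportion (Wilson score method).

    prevalence is the observed proportion (0..1), n is the sample size.
    The choice of the Wilson method is an assumption made for this sketch.
    """
    denominator = 1 + z**2 / n
    centre = (prevalence + z**2 / (2 * n)) / denominator
    half_width = (z / denominator) * math.sqrt(
        prevalence * (1 - prevalence) / n + z**2 / (4 * n**2)
    )
    return centre - half_width, centre + half_width


# Same prevalences and sample sizes as Table 5.1.
for prevalence in (0.02, 0.10, 0.20):
    for n in (500, 1000):
        low, high = wilson_interval(prevalence, n)
        print(f"prevalence {prevalence:.0%}, n={n}: "
              f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Doubling the sample size narrows the interval, but by less than half, which is why the required precision has to be decided before the sample size is fixed.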
2. Sampling methods
There are 2 main categories of sampling:
o Non-probability Sampling
o Probability Sampling
a)
Non-probability Sampling
Convenience Sampling: A sample that happens to be available at the time or period of the research is selected, for convenience's sake. The sample may not be representative of the population under study.
Quota Sampling: The sample is chosen from the sources available at the time until the investigator's quota is filled. This method is useful only when a convenience sample would not provide the desired balance of elements in the population.
Note: With non-probability sampling, sampling variability cannot be quantified and the findings cannot be generalised to the population.
b)
Probability Sampling
Employs random sampling procedures to ensure that the sampling unit is selected on the basis of chance. Every member of the population has a known chance of being included in the sample.
3. How to do Sampling
a)
Matching:
When confounding cannot be controlled by randomization, individual cases are matched with individual controls that have similar confounding factors, such as age, to reduce the effect of the confounding factors on the association being investigated in analytic studies. Most commonly seen in case-control studies.
b)
Restriction (Specification):
Eligibility for entry into an analytic study is restricted to individuals within a certain range of values for a confounding factor, such as age, to reduce the effect of the confounding factor when it cannot be controlled by randomization. Restriction limits the external validity (generalizability) to those with the same confounder values.
c)
Census:
A sample that includes every individual in a population or group (e.g., entire herd, all known cases). A census is not feasible when the group is large relative to the costs of obtaining information from individuals.
d)
Haphazard, Convenience, Volunteer, Judgmental Sampling:
Any sampling not involving a truly random mechanism. A hallmark of this form of sampling is that the probability that a given individual will be in the sample is unknown before sampling. The theoretical basis for statistical inference is lost and the result is inevitably biased in unknown ways. Despite their best intentions, humans cannot choose a sample in a random fashion without a formal randomizing mechanism.
e)
Consecutive (Quota) Sampling:

Sampling individuals with a given characteristic as they are presented until enough with that characteristic are acquired. This method is okay for descriptive studies but unfortunately not much better than haphazard sampling for analytical observational studies.
f)
Random Sampling:
Each individual in the group being sampled has a known probability of being included in the sample obtained from the group before the sampling occurs.
g)
Simple Random Sampling / Allocation:

Sampling conducted such that each eligible individual in the population has the same chance of being selected or allocated to a group. This sampling procedure is the basis of the simpler statistical analysis procedures applied to sample data. Simple random sampling has the disadvantage of requiring a complete list of identified individuals making up the population (the list frame) before the sampling can be done.
h)
Stratified Random Sampling:

The group from which the sample is to be taken is first stratified on the basis of an important characteristic related to the problem at hand (e.g., age, parity, weight) into subgroups such that each individual in a subgroup has the same probability of being included in the sample but the probabilities are different between the subgroups or strata. Stratified random sampling assures that the different categories of the characteristic that is the basis of the strata are sufficiently represented in the sample but the resulting data must be analyzed using more complicated statistical procedures (such as Mantel-Haenszel) in which the stratification is taken into account.
i)
Cluster Sampling:
Staged sampling in which a random sample of natural groupings of individuals (houses, herds, kennels, households, stables) is selected and all the individuals within each selected cluster are then sampled. Cluster sampling requires special statistical methods for proper analysis of the data and is not advantageous if the individuals are highly correlated within a group (a strong herd effect).
j)
Systematic Sampling:
From a random start in the first n individuals, every nth animal is sampled as they are presented at the sampling site (clinic, chute, ...). Systematic sampling will not produce a random sample if a cyclical pattern is present in the important characteristics of the individuals as they are presented. Systematic sampling has the advantage of requiring only knowledge of the number of animals in the population to establish n and that anyone presenting the animals is blind to the sequence so they cannot bias it.
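Three of the probability sampling methods described above (simple random, systematic and stratified random sampling) can be sketched as follows. This is only an illustration under stated assumptions: the sampling frame is a hypothetical list of 100 numbered individuals, the two strata are invented for the example, and the function names are not taken from any package.

```python
import random


def simple_random_sample(frame, k, rng):
    """Every individual in the list frame has the same chance of selection."""
    return rng.sample(frame, k)


def systematic_sample(frame, interval, rng):
    """Random start among the first `interval` individuals, then every nth one."""
    start = rng.randrange(interval)
    return frame[start::interval]


def stratified_sample(frame, stratum_of, k_per_stratum, rng):
    """Split the frame into strata, then draw a simple random sample in each."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(units, min(k_per_stratum, len(units)))
        for name, units in strata.items()
    }


rng = random.Random(2012)        # fixed seed so the sketch is repeatable
frame = list(range(1, 101))      # hypothetical list frame of 100 individuals

print(simple_random_sample(frame, 10, rng))
print(systematic_sample(frame, 10, rng))
print(stratified_sample(frame, lambda u: "low" if u <= 50 else "high", 5, rng))
```

Note that only the simple random and stratified versions need the complete list frame in advance; the systematic version needs only the sampling interval.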
4. Recruiting subjects
Most people are willing to take part in medical surveys provided that they trust the investigators, just as patients will nearly always help their own doctors in their research. In population studies, however, there has usually been no previous contact. The selected subjects need an explanation of the purpose of the study, of why they in particular have been asked to take part, of what is expected from them, and what if anything they will get out of it (for instance a medical check up or a report on the research findings). Local general practitioners, too, need to
know what is going on. Time given to preparatory public relations is always well spent. Response must be made as easy as possible. If attendance at a centre is required, it is better to send everyone a provisional appointment than to expect them to reply to a letter asking whether they are willing to attend. Provision of transport may be welcomed. Often the difference between a mediocre response and a good one is tactful persistence, including second invitations (perhaps by recorded delivery), telephone calls, identifying the reasons for non-attendance, and home visits.
5. Response rates
The level of response that is acceptable depends both on the study question and on the population in which the question is being asked. Problems arise because non-responders may be atypical. For example, in a survey of coronary risk factors among adults registered with a group practice, those at highest risk may be the least inclined to complete a questionnaire or attend for examination. If a response rate of 85% were achieved, an estimated prevalence of heavy alcohol consumption of 3% among the responders could be substantially too low if most of the non-responders drank heavily. On the other hand an estimated 50% prevalence of smokers would not need major revision, even if all of the non-responders smoked. What matters is how unrepresentative non-responders are in relation to the study question. It is not important whether they are atypical in other respects. In a survey to evaluate the association between serum IgE concentrations and ventilatory function it would not matter if non-responders had an unusually high frequency of respiratory disease, provided that the relation of their ventilatory function to IgE was not unrepresentative. Assessment of the likely bias resulting from incomplete response is ultimately a matter of judgement. However, two approaches may help the assessment. Firstly, a small random sample can be drawn from the non-responders, and particularly
vigorous efforts made to encourage their participation, including home visits. The findings for this subsample will then indicate the extent of bias among nonresponders as a whole. Secondly, some information is generally available for all people listed in the study population. From this it will be possible to contrast responders and non-responders with respect to characteristics such as age, sex, and residence. Differences will alert the investigator to the possibility of bias. In addition, it may help to put absolute bounds on the uncertainty arising from non-response by making extreme assumptions about the non-responders. For example, if the aim of a survey were to estimate a disease prevalence, what would be the prevalence if all of the non-responders had the disease, or none of them?
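The "extreme assumptions" calculation can be written out directly. The sketch below takes the two examples from the text (85% response with 3% heavy drinkers, and 85% response with 50% smokers) and computes the lowest and highest prevalence the whole study population could have. The function name prevalence_bounds is hypothetical.

```python
def prevalence_bounds(prevalence_in_responders, response_rate):
    """Absolute bounds on the true prevalence under extreme assumptions.

    Lower bound: none of the non-responders has the condition.
    Upper bound: all of the non-responders have the condition.
    Both arguments are proportions between 0 and 1.
    """
    affected_responders = prevalence_in_responders * response_rate
    lower = affected_responders
    upper = affected_responders + (1 - response_rate)
    return lower, upper


for label, prevalence in (("heavy alcohol consumption", 0.03), ("smoking", 0.50)):
    low, high = prevalence_bounds(prevalence, response_rate=0.85)
    print(f"{label}: {100 * low:.2f}% to {100 * high:.2f}%")
```

For the 3% estimate the upper bound is several times the observed figure, while for the 50% estimate the bounds stay within a few percentage points of it, which is the contrast the text describes.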
6. Analysis
Small studies can sometimes be analysed manually with the help of a calculator. Nowadays, however, the analysis of epidemiological data is almost always carried out by computer. With recent advances in technology, all but the largest data sets can be handled satisfactorily on a personal computer. Moreover, a wide range of software packages is now available to assist epidemiological analysis. The starting point for analysis by computer is the coding and entry of data. These procedures should be checked, usually by carrying them out in duplicate. In addition, once the data have been entered, further checks should be made to ensure that all codes are valid (for example, nobody should have 31 February as a birth date) and to look for any internal inconsistencies (such as a date of admission to hospital being earlier than the subject's date of birth). Statistical analysis should only begin when the data set is as "clean" as possible. With the ready availability of software packages, it is tempting for medical investigators to embark on analyses they do not fully understand, and in the process they may use inappropriate statistical techniques. For this reason it is preferable to obtain advice from a statistician when carrying out all but the simplest analyses. As with the earlier stages of data processing, statistical calculations should all be checked.
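The two kinds of check mentioned above (invalid codes such as 31 February, and internal inconsistencies such as admission before birth) can be sketched as below. The record layout, with dates held as (year, month, day) tuples under the keys "birth" and "admission", is an assumption made for the example.

```python
from datetime import date


def parse_date(year, month, day):
    """Return a date, or None if the combination is not a real calendar date."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def check_record(record):
    """List the problems found in one record (hypothetical record layout)."""
    problems = []
    birth = parse_date(*record["birth"])
    admission = parse_date(*record["admission"])
    if birth is None:
        problems.append("invalid birth date")
    if admission is None:
        problems.append("invalid admission date")
    # Consistency can only be checked when both dates are valid.
    if birth and admission and admission < birth:
        problems.append("admission before birth")
    return problems


records = [
    {"birth": (1980, 2, 31), "admission": (2012, 5, 1)},   # 31 February
    {"birth": (1985, 6, 10), "admission": (1984, 1, 1)},   # admitted before birth
    {"birth": (1990, 3, 15), "admission": (2012, 5, 1)},   # clean
]
for number, record in enumerate(records, start=1):
    print(number, check_record(record))
```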
Inferential Statistics
When we carry out a study, we want to draw an inference from the data collected, for example "Drug A is better than drug B in treating disease C". The null hypothesis would then read: "there is no difference in effectiveness between drug A and drug B in treating disease C". Inferential statistics then determine whether or not there is a significant difference in effectiveness between drug A and drug B. If there is a significant difference, the null hypothesis is rejected, i.e. there is a significant difference in effectiveness between the two drugs (p<0.05). Conversely, if there is no significant difference, the null hypothesis is not rejected, i.e. there is no significant difference in effectiveness between drug A and drug B in treating disease C (p>0.05). The significance level used to decide whether or not to reject the null hypothesis is usually set at 0.05 or 0.01; in the example above it is set at 0.05. The confidence level is 1 minus the significance level: if the significance level is 0.05, the confidence level is 95%. How to calculate the p value for inferential statistics is explained later.
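To make the drug A versus drug B decision rule concrete, here is a minimal sketch using a two-sided z-test for two proportions. The choice of this test, the function names and the cure counts are all assumptions made for illustration; the booklet does not specify a test at this point and the numbers are not real data.

```python
import math


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p value for H0: the two groups have the same success proportion."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def decide(p_value, alpha=0.05):
    """Apply the rule in the text: reject H0 when p is below the significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Hypothetical counts: 60 of 100 patients cured on drug A, 40 of 100 on drug B.
p = two_proportion_p_value(60, 100, 40, 100)
print(f"p = {p:.4f}: {decide(p)}")
```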
1. Inferential Statistics for Quantitative Data

Note: For teaching purposes, SPSS 12.0 is used as the example software. However, many other packages can be used, such as EpiInfo, SAS, MiniTab, Statistica, QStat and so on. SPSS can accept data in various formats such as Excel (Worksheet 4.0), DBase III & IV, Access, Lotus and so on. SPSS can also be used for data entry. For how to create variables and enter data in SPSS, and how to import data in other formats into SPSS, please refer to the corresponding topics in this booklet.
Types of tests for quantitative data
Parametric:
o Independent t-test (Student's t-test)
o Paired t-test
o ANOVA
o Correlation & regression
Non-parametric:
o Wilcoxon rank sum test
o Mann-Whitney test
o Kruskal-Wallis
2. Errors
Even when the significance level and confidence level have been set, errors are still possible. There are 2 types of error: Type I error and Type II error.

True state                                              Conclusion of the significance test
                                                        Null hypothesis not rejected    Null hypothesis rejected
Null hypothesis true (H0 should not be rejected)        Correct conclusion              Type I error
Null hypothesis not true (H0 should be rejected)        Type II error                   Correct conclusion

Type I error - rejecting the null hypothesis when it is in fact true (e.g. when means/proportions are compared, a small difference is found, but that difference is found to be significant, so the null hypothesis is rejected. Possibly caused by problems such as too large a sample size).
Type II error - not rejecting the null hypothesis when it is in fact false (e.g. when means/proportions are compared, a difference is found, but that difference is found not to be significant, so the null hypothesis is not rejected. Possibly caused by problems such as too small a sample size).
CHAPTER IV.
Basic Introduction to SPSS
Introduction to computerised analysis systems.

We now live in a convenient and sophisticated age. Analyses today make use of computer programs designed specifically for statistics. Besides SPSS, various programs on the market can be used to analyse a study. Epi Info, issued by CDC/WHO, is also often used by researchers. Epi Info can be downloaded free of charge from http://www.cdc.gov/epiinfo/installation.htm. SPSS is one of the computer programs used to store, analyse and process statistical study data. SPSS stands for Statistical Product and Service Solutions. It is available through the website www.spss.com. SPSS also has a branch in Malaysia; further information is available at http://www.spss.com.my/. This guidebook and the course we are running use SPSS version 12.0 (evaluation copy).
How to begin
This learning can be done in 2 ways: online, or by doing exercises using the notes provided. To obtain the notes, you can download the files and print them (Adobe Acrobat *.pdf format; please visit http://161.142.92.99/hululangat/stat/). You are also supplied with several additional files to use as practice material: sga.sav and sga.dbf. You can also get additional notes from http://www.iiumedic.net/biostatistics/v1/, written by Dr Jamaluddin Abdul Rahman, Biostatistics Lecturer, UIA Kuantan.¹
Storing data for analysis.

When a study is carried out, a lot of data is collected from observations, questionnaires, clinical examinations or laboratory tests. The data come in various forms and types: qualitative, quantitative or identifiers. After collection, the data are usually first entered into a data/record book. The data are usually arranged as below:

norekod  umur  etnik  pekan     Marital  sekolah    jenisker    ahliisiru  pariti
1        35    Malay  KB        Married  Secondary  Housewife   5          3
2        24    Malay  PASIRMAS  Married  Secondary  Field work  2          1
3        36    Malay  KB        Married  Secondary  Housewife   7          6
4        21    Malay  BACHOK    Married  Secondary  Housewife   2          1
5        21    Malay  KB        Married  Secondary  Field work  10         1
6        20    Malay  KBKERIAN  Married  Secondary  Housewife   2          1
7        34    Malay  KB        Married  Nil        Housewife   10         9
8        29    Malay  BACHOK    Married  Secondary  Field work  5          2
9        37    Malay  KB        Married  Secondary  Housewife   7          5
10       30    Malay  BACHOK    Married  Secondary  Housewife   4          2

The top row contains the variable names; the data for each individual are then arranged in successive rows below. Only then are the data entered into SPSS. Before the data are entered, we must prepare a place for each variable in SPSS. First list the variable names and their types, either categorical (string in SPSS) or numerical (numeric in SPSS). The variable names must meet the following conditions:

o Unique, different from one another.
o Only 8 characters or fewer.
o Only alphanumeric characters, with no symbols such as %.,*& or SPACE.
o Meaningful, so that it is easy to understand, e.g. n1rekod, meaning the first question, about the record number.
o Does not begin with a number.
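The naming conditions listed above can be checked mechanically before any variable is created. The sketch below encodes only the rules stated in this booklet (unique, at most 8 characters, letters and digits only, not starting with a digit); it is not a description of SPSS's own validation, and the function name is hypothetical. The "meaningful" rule is a matter of judgement and cannot be tested by code.

```python
import re

# A letter followed by up to 7 letters or digits: at most 8 characters,
# alphanumeric only, and never starting with a digit.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")


def is_valid_name(name, existing=()):
    """Check a proposed variable name against the booklet's stated rules."""
    if name in existing:                       # must be unique
        return False
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("norekod", "n1rekod", "1rekod", "ahli isi", "umur%", "Education"):
    print(candidate, is_valid_name(candidate))
```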

The coding for each variable should also be decided in advance (e.g. for ethnicity M=Malay, C=Chinese, etc.) for the data to be entered.
Creating variables in SPSS

1. First open SPSS. (Click START > PROGRAMS > SPSS 12.0 FOR WINDOWS.)
2. You will see a window like the one below. This window is known as the DATA EDITOR. An explanation of the window follows.
3. Move the cursor to the menu DATA > DEFINE VARIABLE (or move the cursor over the first variable name row, right-click and choose DEFINE VARIABLE). The following dialog box will appear.
4. Enter the variable name in the Variable Name box. For this example, enter "norekod". Then click the Type button. The following dialog box will appear.
5. Since norekod is only an identifier variable and will not be analysed, choose type string and set the number of characters to 3 (the total number of cases is 218, so 3 characters are needed). Click CONTINUE. In the previous dialog box, now click COLUMN FORMAT. The following dialog box will appear.
6. Set COLUMN WIDTH to 8 and TEXT ALIGNMENT to CENTER. This will make data entry easier later. Then click OK. The variable shown in the DATA EDITOR is as follows.
7. Do the same for the remaining variables (refer to the data appendix), namely:
Exercise 1: Entering variable names and types in the SPSS spreadsheet
Enter the following variable names and types into the SPSS spreadsheet.

Variable Name  Type     Column Formatting (Width)  Number of Decimals
Age            Numeric  3                          1
Race           Numeric  4                          1
Residenc       String   8                          0
Marital        Numeric  7                          0
Education      Numeric  8                          1
Typework       Numeric  8                          1

8. At the end of the session above, your DATA EDITOR will look like this.
Entering labels for variables

1. SPSS has a unique advantage: it can display the actual meaning of the codes through the VALUE LABELS command. When this command is used, data entered as codes (e.g. 1 for Malay, 2 for Chinese) are displayed with their actual meaning, i.e. Malay or Chinese. To explain further, we will do the next exercise.
2. Right-click on the variable name RACE and choose DEFINE VARIABLE. Then choose the LABELS button. The following dialog box will appear.
3. Enter RACE in the VARIABLE LABEL box. In the VALUE box, enter 1. Then enter MALAY in the VALUE LABEL box. Press ADD. Do the same for 2=CHINESE, 3=INDIAN and 4=OTHERS. The final result should look like this.
4. Press CONTINUE and then OK.
5. As a trial, enter the values 1, 2, 3 and 4 in the RACE column as in the figure below.
6. Press the VALUE LABELS button (circled in red below).
7. The data will now look like this. This is what labels are for. The same labels will be used in tables, charts and any output produced from this variable. It is therefore best to use the labels as you want them to appear in the final report (e.g. in English or Bahasa Malaysia), because charts or tables can be pasted directly from SPSS into a word processor such as Word 2003.
Exercise 2: Completing variables and labels in the SPSS spreadsheet
As an exercise, complete the following labels.

Variable   Label
Marital    0=single, 1=married, 2=divorced/widowed
Education  1=Nil, 2=Primary, 3=Secondary, 4=Tertiary
Typework   1=Housewife, 2=Office work, 3=Fieldwork
Transform Data [Compute & Recode]

Sometimes the data collected do not meet the needs of the analysis and must be transformed first. For this exercise, we will use data from the CD provided, i.e. the file sga.sav.
1. Compute
1. Open the file by clicking the menu FILE>OPEN. Change the directory to (CD:)>bahan>latihan dr azmi>. Select the file sga.sav and click OPEN. (This applies if the data are stored on the CD.)
2. We will now create a new variable, BMI (Body Mass Index), from the variable WEIGHT1 (weight during the first trimester) and the variable HEIGHT (respondent's height). The BMI formula is weight (kg) / height² (m²).
3. Click the menu TRANSFORM>COMPUTE (as in the figure).
4. The COMPUTE VARIABLE dialog box will appear. Complete it as in the figure below, then click OK.
5. Now look at the DATA EDITOR; the new variable BMI will be visible (you may have to scroll to the right).
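The calculation behind the new BMI variable is a single formula, sketched below outside SPSS. The function name and the example weights and heights are hypothetical, and the sketch assumes height is recorded in metres; the booklet does not state the unit used for HEIGHT in sga.sav, so a height stored in centimetres would first have to be divided by 100.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight (kg) divided by height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical respondents: (weight in kg, height in m).
for weight, height in ((60, 1.50), (48, 1.55), (72, 1.60)):
    print(f"{weight} kg, {height} m -> BMI {bmi(weight, height):.1f}")
```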
Exercise 3: Using formulas you know, compute the following:
i. Convert the baby's weight from kg to grams.
ii. Convert the mother's weight during the first trimester from kg to grams.
iii. Compute a comparison (ratio) of the mother's weight during the first trimester to the baby's weight.
2. Recode
1. We will now recode AGE from continuous data into AGEGROUP (age group), i.e. <=20, 21-30, 31-40 and >40.
2. Click the menu TRANSFORM>RECODE>INTO DIFFERENT VARIABLES.
3. In the dialog box that appears, select AGE from the left-hand box and press the ARROW pointing right. Then enter AGEGROUP in the OUTPUT VARIABLE:NAME box and click CHANGE. It will look like the figure below.
4. Now click the OLD AND NEW VALUES button. The following dialog box will appear.
5. Select as shown above and click ADD. Recode 21-30 to VALUE 2, 31-40 to VALUE 3, and 41 THROUGH HIGHEST to VALUE 4. When finished, it will look like the figure below. Press CONTINUE and then OK.
6. On scrolling to the right, the new variable AGEGROUP will be visible. To complete this step, enter labels for AGEGROUP through DATA>DEFINE VARIABLE. The labels are 1="less than 21 years", 2="21 to 30 years", 3="31 to 40 years" and 4=">40 years".

Exercise 4: Recode the following variables:
i. Recode AGE again into a new variable, AGERISK, consisting of 2 groups: those aged 19 to 35 years (coded as 0) and those aged 36 years and above (coded as 1).
ii. Recode BMI into the following categories:

BMI        Code  Category
< 18       1     Low BMI
18.1-25.0  2     Normal BMI
25.1-27.0  3     Overweight
27.1-30.0  4     Obese type 1
30.1-35.0  5     Obese type 2
> 35       6     Morbidly obese

iii. Recode Hemoglobin 2 (lowest hb at 2) into the following categories: less than 9 (code 1) severe anaemia, 9 to 11.0 (code 2) moderate anaemia, and more than 11 (code 3) normal.
iv. How would we recode data that are missing or have no input? Example: variable Reflolux
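The worked recode of AGE into AGEGROUP (<=20, 21-30, 31-40, >40) follows a simple rule that can be sketched as below. This is an illustration only: the function name is hypothetical, ages are assumed to be recorded in whole years, and returning None for a missing age is one possible convention rather than what SPSS does.

```python
AGEGROUP_LABELS = {
    1: "less than 21 years",
    2: "21 to 30 years",
    3: "31 to 40 years",
    4: ">40 years",
}


def age_group(age):
    """Recode an age in whole years into the four AGEGROUP codes."""
    if age is None:          # missing input stays missing (assumed convention)
        return None
    if age <= 20:
        return 1
    if age <= 30:
        return 2
    if age <= 40:
        return 3
    return 4


# Ages of the ten example records shown earlier in this chapter.
for age in (35, 24, 36, 21, 21, 20, 34, 29, 37, 30):
    code = age_group(age)
    print(age, code, AGEGROUP_LABELS[code])
```

Keeping the original AGE variable and writing the codes to a new variable, as the "INTO DIFFERENT VARIABLES" menu does, means the recode can be checked or redone with different cut-offs later.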
Exploring
In exploring your data, you will be producing summary statistics and graphical displays, either for all the collected data or separately for groups of cases. There are many reasons why you would want to explore your data, among them are

o Data screening
o Outlier identification
o Description
o Assumption checking
o Characterizing differences among groups of cases (subpopulations)
Data screening may show that you have unusual values, extreme values, gaps in the data or other peculiarities. Exploring the data can help determine whether

o the statistical techniques chosen would be appropriate
o you need to transform the data prior to analysis
o you may need to conduct non-parametric tests
Among the statistical output and plots that would help in exploring the data are:

o Mean, median, 5% trimmed mean, standard error, variance, standard deviation, minimum, maximum, range, interquartile range, skewness and kurtosis and their standard errors, confidence interval for the mean (and specified confidence level), percentiles
o Huber's M-estimator, Andrews' wave estimator, Hampel's redescending M-estimator, Tukey's biweight estimator, the five largest and five smallest values, the Kolmogorov-Smirnov statistic with a Lilliefors significance level for testing normality, and the Shapiro-Wilk statistic.
o Boxplots, stem-and-leaf plots, histograms, normality plots, and spread-versus-level plots with the Levene test and transformations.
1. Explore in SPSS
The data used in this exercise are from sga.sav.
1. From the menus choose: Analyze > Descriptive Statistics > Explore...
Select one or more dependent variables. For this example, select AGE.
Optionally, you can:
o Select one or more factor variables, whose values will define groups of cases. Select the variable CASE (case or control).
o Select an identification variable to label cases. Select the variable INDEX.
o Click Statistics for robust estimators, outliers, percentiles, and frequency tables. Select DESCRIPTIVES and OUTLIERS.
o Click Plots for histograms, normal probability plots and tests, and spread-versus-level plots with Levene's statistic. Select HISTOGRAM and STEM-AND-LEAF.
o Click Options for the treatment of missing values.
Then press OK. The following are among the results that will appear in the DATA OUTPUT window.
[Figures: Explore output for SGA; AGE histograms]
Stem-and-Leaf Plots
AGE Stem-and-Leaf Plot for CASE= Normal

Frequency  Stem &  Leaf
 1.00      1 .     9
 5.00      2 .     01111
 9.00      2 .     222233333
15.00      2 .     444444555555555
17.00      2 .     66666666666677777
10.00      2 .     8888899999
14.00      3 .     00000000111111
 8.00      3 .     23333333
 9.00      3 .     444445555
 7.00      3 .     6667777
 6.00      3 .     888889
 3.00      4 .     111
 1.00      4 .     3
 2.00      4 .     44
 1.00      4 .     6

Stem width: 10
Each leaf: 1 case(s)

From this output we can see the distribution of the data: whether or not it is normal, whether or not there are outliers, and so on.
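A few of the summary statistics that Explore reports, and a simplified stem-and-leaf display, can be sketched with the standard library. The ages used are the ten values from the example data table earlier in this chapter, not the full sga.sav file. The stem_and_leaf function is hypothetical and puts all leaves for a stem on one row, whereas the SPSS output above splits each stem across several rows.

```python
import statistics

# Ages (umur) of the ten example records shown earlier in this chapter.
ages = [35, 24, 36, 21, 21, 20, 34, 29, 37, 30]

summary = {
    "n": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "standard deviation": statistics.stdev(ages),   # sample SD (n - 1)
    "minimum": min(ages),
    "maximum": max(ages),
    "range": max(ages) - min(ages),
}
for name, value in summary.items():
    print(f"{name}: {value}")


def stem_and_leaf(values, stem_width=10):
    """Group whole-number values by stem; each leaf is the remaining digit."""
    rows = {}
    for value in sorted(values):
        rows.setdefault(value // stem_width, []).append(value % stem_width)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in rows.items()}


for stem, leaves in stem_and_leaf(ages).items():
    print(f"{stem} . {leaves}")
```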
Exercise 5: Explore the following variables:
i. WEIGHT1 as the dependent variable.
ii. WEIGHT2 as the dependent variable.
iii. WEIGHTG1 as the dependent variable.
iv. WEIGHTG2 as the dependent variable.
v. WEIGHTG3 as the dependent variable.
Discuss the results you obtain. You are sure to find something interesting!
Frequencies
The Frequencies procedure provides statistics and graphical displays that are useful for describing many types of variables. For a first look at your data, the Frequencies procedure is a good place to start. For a frequency report and bar chart, you can arrange the distinct values in ascending or descending order or order the categories by their frequencies. The frequencies report can be suppressed when a variable has many distinct values. You can label charts with frequencies (the default) or percentages.
1. Statistics and plots

Frequency counts, percentages, cumulative percentages, mean, median, mode, sum, standard deviation, variance, range, minimum and maximum values, standard error of the mean, skewness and kurtosis (both with standard errors), quartiles, user-specified percentiles, bar charts, pie charts, and histograms.
2. Obtaining frequencies in SPSS

To obtain frequencies and statistics, from the menus choose: Analyze > Descriptive Statistics > Frequencies...
Select one or more categorical or quantitative variables. To begin, select the categorical variable CASE, then click OK straight away. The result is as shown below.
Choose FREQUENCIES again. Press the RESET button. Now select a numerical variable, NTVISIT (number of antenatal visits). Deselect the DISPLAY FREQUENCY TABLES box.
Click Statistics for descriptive statistics for quantitative variables. Select MEAN, MODE, MEDIAN, VARIANCE, MINIMUM, MAXIMUM, STANDARD DEVIATION, SKEWNESS & KURTOSIS (as in the figure).
Click Charts for bar charts, pie charts and histograms. Select HISTOGRAMS WITH NORMAL CURVE.
Click Format for the order in which results are displayed. Here you can choose to suppress tables with more than 10 categories.
Press OK. The results are as follows.
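The counts, percentages and cumulative percentages in a frequency table can be reproduced by hand as a check. The sketch below uses the anaemia counts reported later in this booklet (196 normal, 18 anaemic, 4 unknown, total 218); the function name frequency_table and the rounding to one decimal place are choices made for this example.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (category, count, percent, cumulative percent), largest first."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for category, count in counts.most_common():
        percent = 100 * count / total
        cumulative += percent
        rows.append((category, count, round(percent, 1), round(cumulative, 1)))
    return rows


# One entry per woman, rebuilt from the counts in the booklet's anaemia table.
status = ["Normal"] * 196 + ["Anaemia"] * 18 + ["Unknown"] * 4

for row in frequency_table(status):
    print(row)
```

Rounding at the reporting stage, rather than pasting percentages with six decimal places, also makes the final table easier to read.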
Exercise 6: Obtain the frequencies of the following variables:
1. Hemoglobin 3rd Trimester
2. Weight Gain at 3rd Trimester
The same results can also be obtained with the command STATISTICS > SUMMARISE > DESCRIPTIVES.
Describing Data (Descriptives)

The Descriptives procedure displays univariate summary statistics for several variables in a single table and calculates standardized values (z scores). Variables can be ordered by the size of their means (in ascending or descending order), alphabetically or by the order in which you select the variables (the default). When z scores are saved, they are added to the data in the Data Editor and are available for SPSS charts, data listings, and analyses. When variables are recorded in different units (for example, gross domestic product per capita and percentage literate), a z-score transformation places variables on a common scale for easier visual comparison. The statistics available here are sample size, mean, minimum, maximum, standard deviation, variance, range, sum, standard error of the mean, and kurtosis and skewness with their standard errors.
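A z score is simply a value's distance from the mean, measured in standard deviations. The sketch below shows the calculation on two hypothetical variables recorded in different units; it assumes the sample standard deviation (n - 1 denominator), and the function name z_scores and the data are invented for illustration.

```python
import statistics


def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample SD, n - 1 denominator
    return [(value - mean) / sd for value in values]


# Hypothetical measurements in different units.
weights_kg = [48, 55, 60, 62, 75]
heights_cm = [150, 152, 155, 158, 165]

print([round(z, 2) for z in z_scores(weights_kg)])
print([round(z, 2) for z in z_scores(heights_cm)])
```

After standardization both variables have mean 0 and standard deviation 1, so an unusually heavy and an unusually tall respondent can be compared on the same scale.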
Exercise 7: Try running the DESCRIPTIVES command yourself for the numerical variables below:
1. Total family members
2. Number of antenatal visits
3. Weight 2
4. Weight 3
5. Height
Discuss your findings with the facilitator.
Importing Files, Copy & Paste

1. Import
SPSS can import data from various other software packages, including:
o SPSS. Opens data files saved by SPSS for Windows, Macintosh, UNIX, and also by the DOS product SPSS/PC+.
o SPSS/PC+. Opens SPSS/PC+ data files.
o SYSTAT. Opens SYSTAT data files.
o SPSS portable. Opens SPSS data files saved in portable format. Saving a file in portable format takes considerably longer than saving the file in SPSS format.
o Excel. Opens spreadsheet files saved in Excel 4 or earlier versions. For Excel 5 or later versions, use Open ODBC with the appropriate Excel ODBC driver.
o Lotus 1-2-3. Opens data files saved in 1-2-3 format for release 3.0, 2.0, or 1A of Lotus.
o SYLK. Opens data files saved in SYLK (symbolic link) format, a format used by some spreadsheet applications.
o dBASE. Opens dBASE format files for either dBASE IV, dBASE III or III PLUS, or dBASE II. Each case is a record. Variable and value labels and missing-value specifications are lost when you save a file in this format.
o Tab-delimited. Opens ASCII text data files with data values separated by tabs.
a)
Example
Importing a file is easy. The example shown uses the dBase IV format, i.e. sga.dbf.
1. Click the menu FILE>OPEN. In the dialog box that appears, change to 3.5" Floppy (A:) in the LOOK IN box. In the FILES OF TYPE box, choose dBase (*.dbf). The file name sga.dbf will appear in the file list. Select sga.dbf and click OPEN. (This applies if sga.dbf has been saved on diskette A:.)
2. The data go straight into the DATA EDITOR and a processing statement appears in the DATA OUTPUT window. The only oddity with dBase is that there will be a variable d_r in the first column, which needs to be deleted. Select the d_r column, then click the menu EDIT>CLEAR.
3. After this, the imported variables can be modified using the DEFINE VARIABLE command.
Exercise 8: Import the Excel-format file (found on the CD, file name: Drug T Student) into SPSS.
Your success in importing the data is also the facilitator's success, since the facilitator is working to make you proficient in SPSS.
COPY & PASTE
1. Two things are often copied and pasted from SPSS: tables and graphs. Graphs are the easiest, so we start with them.
2. Make sure the word processor (e.g. Word) and SPSS are both open. Select the graph to be copied in the DATA OUTPUT window by left-clicking on it once. A red marker will appear to its left.
3. Then click the menu EDIT>COPY OBJECTS (or CTRL+ALT+C). Click on the TASKBAR to go to WORD. Click EDIT>PASTE (CTRL+V). You can also choose PASTE SPECIAL; make sure the FORMATTED RTF/DOC type is selected.
4. The pasted object behaves like any other image. If anything needs changing, it must be done in SPSS before pasting.
5. To copy a table, make sure Excel is also open. Select the table to be copied in the DATA OUTPUT window by left-clicking on it once. A red marker will appear to its left.
6. Then click the menu EDIT>COPY OBJECTS (or CTRL+ALT+C). Click on the TASKBAR to go to WORD. Click EDIT>PASTE (CTRL+V). It will look almost the same as in DATA OUTPUT. Unfortunately it cannot be clicked at all; if it is clicked, the table becomes a mess. If anything needs changing, it must be done in SPSS before pasting.
7. A better way is to use EXCEL. As before, select the table first. But when copying, use the command EDIT>COPY (or CTRL+C). Use the TASKBAR to go to EXCEL and choose EDIT>PASTE (CTRL+V).
8. Modify it using the usual EXCEL commands. Select the table again in EXCEL, COPY it, and only then PASTE it into WORD.
Table 1: Frequency of anemia among pregnant women at HUSM
Status | Frequency | Percent
Normal | 196 | 89.90826
Anemia | 18 | 8.256881
Unknown | 4 | 1.834862
Total | 218 | 100
Exercise 9: Try COPY & PASTE yourself using other figures and tables.
Select and Deselect Cases

If you want to analyse only selected cases in SPSS, use the select and deselect cases method, as follows:
1. Open the file sga.sav.
2. Click Data>Select Cases.
3. The Select Cases dialog appears.
4. Choose "If condition is satisfied".
5. Select the variable smoking in the left panel and click the small arrow to move it into the dialog box.
6. The formula needs to be set up with care. Here we want to analyse only the patients exposed to passive smoking, so the patients who were not exposed to passive smoking must be excluded.
7. The formula smoking ~= 0 does this.
8. Click Continue>OK.
9. Cases that will not be included in the analysis are marked with a /.
10. Remember to deselect again before running another analysis. Click Data->Select Cases and click the Reset button.
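The same filtering idea can be sketched in Python, outside SPSS. This is only an illustration: the case records below are hypothetical, and the coding "0 = not exposed" is an assumption. Only the variable name smoking and the condition smoking ~= 0 come from the steps above.

```python
# Hypothetical case records. The coding smoking = 0 for "not exposed" is an
# assumption, so smoking != 0 mirrors the SPSS condition smoking ~= 0.
cases = [
    {"id": 1, "smoking": 0},
    {"id": 2, "smoking": 2},
    {"id": 3, "smoking": 2},
    {"id": 4, "smoking": 0},
]

# Cases that satisfy the condition are kept for analysis.
selected = [case for case in cases if case["smoking"] != 0]
# Cases that fail the condition are the ones SPSS marks with a slash.
excluded = [case for case in cases if case["smoking"] == 0]

print("selected ids:", [case["id"] for case in selected])
print("excluded ids:", [case["id"] for case in excluded])
```

Building the selection as a new list leaves the original records untouched, much as resetting the selection in SPSS restores all cases.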
CHAPTER V.
Parametric Analysis
Z-Test
Qualitative data are summarised as proportions. To compare proportions between study groups, statistical tests such as the Z-test and the chi-square test can be used.
a) Z-Test
Used to compare 2 proportions. Formula:
$$z = \frac{p_1 - p_2}{\sqrt{p_0 q_0 \left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$$
where
p1 is the proportion for event 1 = a1/n1
p2 is the proportion for event 2 = a2/n2
a1 and a2 are the counts of events 1 and 2
$$p_0 = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}, \qquad q_0 = 1 - p_0$$
The normal distribution table is used to decide whether or not to reject the null hypothesis.
Worked example
Comparison of worm infestation rates between boys and girls at a school.
Rate in boys = 29/96 = 0.302
Rate in girls = 24/104 = 0.231
From the calculation:
$$p_0 = \frac{29 + 24}{96 + 104} = 0.265, \qquad q_0 = 1 - 0.265 = 0.735$$
$$z = \frac{0.302 - 0.231}{\sqrt{(0.735 \times 0.265)\left[\frac{1}{96} + \frac{1}{104}\right]}} = 1.1367$$
From the normal distribution table, the z value that is significant at the 0.05 significance level is 1.96. The calculated z is smaller than 1.96, so there is no significant difference between the 2 proportions.
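The worked example above can be checked with a short Python sketch. The function name two_proportion_z is an assumption made for illustration; the two-sided p value is taken from the normal distribution through math.erfc instead of a printed table.

```python
import math

def two_proportion_z(a1, n1, a2, n2):
    """Return (z, two-sided p) for comparing two proportions with a pooled SE."""
    p1 = a1 / n1
    p2 = a2 / n2
    p0 = (a1 + a2) / (n1 + n2)          # pooled proportion
    q0 = 1 - p0
    se = math.sqrt(p0 * q0 * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Worm infestation example: 29 of 96 boys, 24 of 104 girls.
z, p_value = two_proportion_z(29, 96, 24, 104)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

With unrounded proportions z comes out near 1.14 instead of 1.1367, because the hand calculation rounds the two rates to three decimals first. The conclusion is the same: z is below 1.96.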
Chi-Square Test (χ²)

Used to test whether there is an association between 2 qualitative variables. The data are first arranged in a contingency table according to the observations. The expected values are then calculated from the column and row totals of the observed table, as shown below.
Observed table
 | + | - | Total
+ | a | c | e
- | b | d | f
Total | g | h | n
Expected table
 | + | - | Total
+ | eg/n | eh/n | e
- | fg/n | fh/n | f
Total | g | h | n
The chi-square value is calculated by summing (observed - expected)²/expected over the cells of the table.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Degrees of freedom: df = (number of rows - 1)(number of columns - 1)
Worked example
Observed table
 | + | - | Total
+ | 29 | 67 | 96
- | 24 | 80 | 104
Total | 53 | 147 | 200
Expected table
 | + | - | Total
+ | 96*53/200 | 96*147/200 | 96
- | 104*53/200 | 104*147/200 | 104
Total | 53 | 147 | 200
This gives the values:
 | + | - | Total
+ | 25.44 | 70.56 | 96
- | 27.56 | 76.44 | 104
Total | 53 | 147 | 200
So the value of χ² is
$$\chi^2 = \frac{(29 - 25.44)^2}{25.44} + \frac{(24 - 27.56)^2}{27.56} + \frac{(67 - 70.56)^2}{70.56} + \frac{(80 - 76.44)^2}{76.44} = 1.303$$
In the χ² table at df = 1 and a significance level of 0.05, the critical value is 3.84. Because the calculated χ² is smaller than the tabulated χ², there is no significant difference in proportions. The null hypothesis is therefore not rejected.
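The chi-square calculation above can be sketched in Python as follows. The function name chi_square is an assumption; it computes each expected value from the row and column totals and then sums (O - E)²/E, using the observed counts from the worked example.

```python
def chi_square(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count for the cell
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Observed counts from the worked example (rows sum to 96 and 104).
chi2, df = chi_square([[29, 67], [24, 80]])
print(f"chi-square = {chi2:.3f}, df = {df}")
```

The same function works for larger tables, since the expected value of every cell is always row total times column total divided by n.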
1. Calculating χ² using SPSS

This example uses data from a case-control study of SGA (small for gestational age) at HUSM. The question is whether there is an association between the risk factor passive smoking and the occurrence of SGA. Both variables are qualitative: Case (SGA/Normal) and Smoking (No/Active/Passive).
1. First open the data.
2. Then click Analyze ->Descriptive Statistics ->Crosstabs (see the figure below).
3. In the dialog that appears, enter the variables to be tested. Usually the risk factor (SMOKING) is placed in the rows and the disease (CASE) in the columns. More than one qualitative variable can be entered. Click the Statistics button and select Chi-square. Click "Continue" and then click the CELLS button. Select ROW PERCENT. Click CONTINUE and then click "OK".
4. SPSS then runs the chi-square test and the "Output" window appears showing the results, as below.
5. The output shows that among passive smokers, 57.1% (89/156) had SGA babies, compared with 42.9% (67/156) who did not. From the next table, the chi-square value is 10.328 and the p value is 0.001. There is therefore a significant association between passive smoking and the occurrence of SGA.
6. The table drawn for the thesis report is as below.
Table 1: Contingency table showing the association between smoking risk and the occurrence of SGA.
Group | Normal | SGA | Total
Non-smoker | 41 | 20 | 61
Passive smoker | 67 | 89 | 156
Total | 108 | 109 | 217
χ² = 10.328, p = 0.001
7. If, in a 2x2 table, any expected cell value is less than 5, the p value and χ² value to read are those in the CONTINUITY CORRECTION row. This is equivalent to the Yates correction. If the sample size is smaller still, i.e. less than 40, the values to read are those in the Fisher's Exact Test row.
Exercise 10: Interpret the following table.
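As a cross-check on the SGA example, the sketch below recomputes χ² for a 2x2 table with and without the continuity (Yates) correction. It uses the counts shown with that example (non-smokers: 41 normal, 20 SGA; passive smokers: 67 normal, 89 SGA). The function name chi_square_2x2 is an assumption, and the p value uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square and p value (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by n/2, but never below zero.
        diff = max(diff - n / 2, 0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-square with 1 df, via the standard normal tail.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

plain_chi2, plain_p = chi_square_2x2(41, 20, 67, 89)
yates_chi2, yates_p = chi_square_2x2(41, 20, 67, 89, yates=True)
print(f"uncorrected: chi-square = {plain_chi2:.3f}, p = {plain_p:.4f}")
print(f"Yates:       chi-square = {yates_chi2:.3f}, p = {yates_p:.4f}")
```

The uncorrected value reproduces the 10.328 reported by SPSS. The corrected value is somewhat smaller, which is the expected effect of the continuity correction.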
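Independent T Test
Used to compare the means of 2 independent groups, for example mean Hb between cases and controls. Two variables are involved: one quantitative variable and one qualitative variable with only 2 categories (e.g. sex: male and female).
General formula:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE}$$
Specific formulas:
If n is 30 or more:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
If n is less than 30:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_0 \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where
$$s_0 = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
with degrees of freedom df = (n1 + n2 - 2). Here x̄, s and n are the mean, standard deviation and sample size of each group.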
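A minimal Python sketch of the pooled-variance t statistic with df = n1 + n2 - 2, assuming the standard textbook form. The function name pooled_t is an assumption. The inputs are the rounded summary values the booklet reports for its drug example (group S: n = 32, 8.38 ± 1.60; group F: n = 34, 8.50 ± 1.64), so the result only approximates the t = 0.313 that SPSS obtains from the raw data.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for two independent groups using a pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Rounded summary statistics: group F first, then group S.
t, df = pooled_t(8.50, 1.64, 34, 8.38, 1.60, 32)
print(f"t = {t:.3f}, df = {df}")
```

Working from summary statistics is handy for checking a published table, but the raw data should be used whenever they are available.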
1. Performing the Independent T Test using SPSS

This example uses data from a study comparing the effectiveness of 2 drugs (drug = F and S) in psychiatric patients. Only patients who completed treatment (status=C) were selected. The comparison is of the change in HAMD score after 6 weeks of treatment (chhamd6) between the 2 groups.
1. First open the data.
2. Then click Analyze ->Compare Means ->Independent Samples T Test (see the figure below).
3. In the dialog that appears, enter the variables to be tested. In the "Test Variable(s):" box, enter the quantitative variable to be tested (chhamd6). More than one quantitative variable can be entered.
4. In the "Grouping Variable:" box, enter the qualitative variable (drug), then click the "Define Groups" button and enter the groups to be compared (S & F). Click "Continue" and then "OK".
5. SPSS then runs the independent t test and the "Output" window appears showing the results, as below.
6. The first table shows the sample size (N), mean and standard deviation of chhamd6 for groups S and F.
7. Look first at the p value (Sig.) for Levene's Test. If p>0.05, use the "equal variances assumed" row. If p<0.05, use the "equal variances not assumed" row. In this case p=0.850, so we use the "equal variances assumed" row. The p value there is 0.755, i.e. p>0.05, so there is no significant difference in the change in HAMD score between the 2 drugs after 6 weeks of treatment.
8. The table drawn for the thesis report is as below.
Table 1: Mean change in HAMD score after 6 weeks of treatment, by treatment group.
Group | N | Mean ± SD | Test | p
S | 32 | 8.38 ± 1.60 | T test, t = 0.313 | 0.755
F | 34 | 8.50 ± 1.64 | |
Paired T Test
Used when a quantitative variable is compared within the same individuals, for example when each individual serves as both control and case in the same study, i.e. before and after an intervention. It can also be used for cases and controls that have been matched on criteria such as age, sex and ethnicity (matched pairs). The test therefore involves 2 paired quantitative variables in one study. The calculation is as follows. First, the difference between the first value and the second value is calculated for each individual in the study = D. Then the mean of D and its standard deviation are calculated. From these 2 values, t is calculated using the formula below:
$$t = \frac{\bar{D}}{s_D / \sqrt{n}}$$
where
$$s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{n - 1}}$$
and df = n - 1, with n the number of pairs.
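The steps described above (take each difference D, then its mean and standard deviation, then t) can be sketched in Python. The function name paired_t and the Hb readings are hypothetical, chosen only to show the mechanics; they are not the booklet's data.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for paired measurements, with D = first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample SD, divisor n - 1
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical Hb values before and after treatment for 5 patients.
before = [10.1, 9.8, 10.4, 10.0, 9.9]
after = [10.6, 10.1, 10.5, 10.4, 10.3]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

Because D is taken as first minus second, a rise after treatment gives a negative t, which matches the sign convention of the negative t value in the booklet's example.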
1. Performing the Paired T Test using SPSS

This example uses data from a study of the effectiveness of hematinic treatment in patients with anemia. The comparison is between Hb before treatment (hb2) and Hb after 6 weeks of treatment (hb3).
1. First open the data (file open: sga-pair.sav).
2. Then click Analyze ->Compare Means ->Paired-Samples T Test (see the figure below).
3. In the dialog that appears, enter the variables to be tested. In the "Paired Variables:" box, enter the pair of quantitative variables to be tested (hb2 & hb3). More than one pair of quantitative variables can be entered. Click "OK".
4. SPSS then runs the paired t test and the "Output" window appears showing the results, as below.
5. The first table shows the number of pairs (N), mean and standard deviation for the pair hb2 & hb3 in patients given the hematinic.
6. The second table shows the correlation within the pair. There is no significant correlation (p>0.05).
7. The p value is 0.004, i.e. p<0.05, so there is a significant difference in Hb after 6 weeks of hematinic treatment. The mean values show that the mean after treatment is larger than the mean before treatment. This suggests that patients improve with hematinic treatment.
8. The table drawn for the thesis report is as below.
Table 1: Mean hemoglobin before and after 6 weeks of hematinic treatment.
Group | N | Mean ± SD | Test | p
Before treatment | 70 | 10.24 ± 0.36 | Paired T test, t = -3.08 | 0.004
After treatment | 70 | 10.59 ± 0.97 | |
Exercise 11: Analyse the file tbkkp.sav for tuberculosis patients using what you have learned.
Find the p value for patients on TB drug treatment who are categorised as cured and so on.
ANOVA (Analysis of Variance)

Used to compare the means of more than 2 independent groups. It is an extension of the t test. An example is mean Hb among the various ethnic groups in Malaysia. Two variables are involved: one quantitative variable and one qualitative variable with more than 2 categories (e.g. ethnicity: Malay, Chinese, Indian & Others). ANOVA can only determine whether there is a significant difference among the means being compared; it cannot show which one differs. To determine which means differ significantly, further analysis is needed using t tests (the usual way) or the LSD calculation in post-hoc analysis (the SPSS way). General formula:
Source of variation | Sum of Squares (variability) | Degrees of Freedom | Mean Square (Variance) | Variance Ratio (F)
Between Groups | a | c | a/c | ad/bc
Within Groups | b | d | b/d |
The calculated F value is looked up in the F table to determine whether the p value is above or below 0.05.
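The quantities in the ANOVA table can be computed with a short Python sketch. The function name one_way_anova and the three small groups of readings are hypothetical. In the table's notation, ss_between is a, ss_within is b, df_between is c and df_within is d, so F = (a/c)/(b/d) = ad/bc.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # a: variability of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # b: variability of the values around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1      # c
    df_within = n - k       # d
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical diastolic readings for three groups.
f_ratio, df_between, df_within = one_way_anova(
    [[70, 72, 68], [80, 82, 78], [90, 88, 92]]
)
print(f"F = {f_ratio:.1f}, df = ({df_between}, {df_within})")
```

The p value would then be read from an F table at the two degrees of freedom, as the text describes.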
1. Performing ANOVA using SPSS

This example uses data from a village in Hulu Langat covering all residents aged 18 years and above. The relationship studied is between obesity level (obesiti: underweight, normal and overweight) and diastolic blood pressure (c5diasto).
1. First open the data.
2. Then click Analyze ->Compare Means ->One-Way ANOVA (see the figure below).
3. In the dialog that appears, enter the variables to be tested. In the "Dependent List:" box, enter the quantitative variable to be tested (c5diasto) (see the figure below). More than one quantitative variable can be entered. In the "Factor:" box, enter the qualitative variable (obesity), then click the Post Hoc button.
4. In the post-hoc dialog, tick LSD (see the figure) and click "Continue". Then click "Options", tick "Descriptive", click "Continue" and then "OK".
5. SPSS then runs the ANOVA and the "Output" window appears showing the results, as below.
From the means in the descriptive table, mean diastolic pressure rises with obesity level.
6. The ANOVA table shows F = 15.106 and p = 0.000 (that is, p < 0.001). There is therefore a significant difference in diastolic blood pressure between obesity levels. The question is which groups differ significantly. For that we look at the post-hoc LSD table.
7. The p value is <0.05 for every comparison, so diastolic pressure differs significantly between all of the obesity groups.
8. The table drawn for the thesis report is as below.
Table 1: Mean diastolic pressure by obesity status.
Group | Mean ± SD | Test | p
Underweight | 78.20 ± 10.46 | ANOVA, F = 15.106 | 0.0000
Normal | 82.23 ± 14.05 | |
Overweight | 87.97 ± 11.75 | |
Exercise 12: Find the p value for a study related to obesity in Setiu district.
Open the file program-unggul.sav in the latihan folder. Dependent list: tchol; factor: fat category. Discuss your findings with your group members.
Correlation
Used to determine whether there is a relationship between 2 related (dependent) quantitative variables. The correlation coefficient (r) has a minimum value of -1 and a maximum value of +1.
-1 means perfect negative correlation
+1 means perfect positive correlation
0 means no correlation at all
General formula:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
where x̄ and ȳ are the means of the two variables.
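A minimal Python sketch of the Pearson correlation coefficient, assuming the standard deviation-from-the-mean form of the formula. The function name pearson_r and the five (x, y) pairs are hypothetical and serve only to show the arithmetic.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r = {r:.3f}")
```

Swapping the two lists gives the same r, which reflects the fact that correlation does not distinguish a dependent from an independent variable.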
1. Performing Correlation using SPSS

This example uses data from a village in Hulu Langat covering all residents aged 18 years and above. The relationship studied is between BMI (Body Mass Index) and diastolic blood pressure (c5diasto).
1. First open the data.
2. Then click Analyze ->Correlate ->Bivariate (see the figure below).
3. In the dialog that appears, enter the variables to be tested. In the "Variables:" box, enter the quantitative variables to be tested, (c5diasto) & (bmi) (see the figure below). More than two quantitative variables can be entered. Then click "OK".
4. SPSS then runs the correlation analysis and the "Output" window appears showing the results, as below.
5. The output shows r = 0.341 and p = 0.000 (that is, p < 0.001). There is therefore a significant relationship between BMI and diastolic blood pressure.
6. The table drawn for the thesis report is as below.
Table 1: Correlation between diastolic pressure and BMI
Variables | r | p
Diastolic pressure and BMI | 0.341 | 0.0000
7. Note: In statistical studies the value of r can be interpreted as follows:
0.00 - 0.3 = weak correlation
0.3 - 0.6 = moderate correlation
0.6 - 1.00 = strong correlation
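The interpretation bands in the note can be written as a small Python helper. The function name correlation_strength is an assumption, and so is the handling of the boundaries: the note lists 0.3 and 0.6 in two bands each, so this sketch places each boundary value in the higher band. The sign of r is ignored, since strength depends on the absolute value.

```python
def correlation_strength(r):
    """Label |r| as weak, moderate or strong using the bands in the note."""
    size = abs(r)
    if size > 1:
        raise ValueError("r must lie between -1 and +1")
    if size < 0.3:
        return "weak"
    if size < 0.6:      # assumption: exactly 0.3 counts as moderate
        return "moderate"
    return "strong"     # assumption: exactly 0.6 counts as strong

print(correlation_strength(0.341))   # the BMI and diastolic pressure example
```

Under these bands the r = 0.341 from the example counts as a moderate correlation, even though its p value is very small.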
Exercise 13: Find the correlation between the mother's weight during pregnancy and the baby's weight at birth.
The file to use is sga-korelasi.sav.
Discuss your findings.
Regression
Used to measure the functional relationship between 2 quantitative variables, where one variable is dependent and the other is independent. The formula used is:
y = a + bx
where
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad a = \bar{y} - b\bar{x}$$
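A minimal Python sketch of fitting y = a + bx by ordinary least squares, assuming the standard estimates for the slope b and intercept a. The function name fit_line and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                # slope
    a = mean_y - b * mean_x      # intercept: the line passes through the means
    return a, b

# Hypothetical (x, y) pairs.
a, b = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {a:.2f} + {b:.2f}x")
```

Unlike correlation, regression is not symmetric: swapping x and y generally gives a different line.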
1. Performing Regression using SPSS

This example uses data from a village in Hulu Langat covering all residents aged 18 years and above. The relationship studied is between BMI (Body Mass Index) and diastolic blood pressure (c5diasto).
1. First open the data.
2. Then click Analyze ->Regression ->Linear (see the figure below).
3. In the dialog that appears, enter the variables to be tested. In the "Dependent:" box, enter the dependent quantitative variable to be tested (c5diasto). In the "Independent" box, enter the independent quantitative variable (bmi) (see the figure below). More than one independent quantitative variable can be entered. Several options are available under "Method"; in this example we use "Enter". Then click "OK".
4. SPSS then runs the regression analysis and the "Output" window appears showing the results, as below.
5. The table to read is the last one, where a = 64.145 with p=0.000 and b = 0.811 with p = 0.000. The formula y=a+bx can therefore be written as:
diastolic pressure = 64.145 + (0.811).(BMI)
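The fitted equation can be used to estimate diastolic pressure for a given BMI, as in the Python sketch below. The coefficients 64.145 and 0.811 come from the SPSS output described above; the function name predicted_diastolic and the BMI values are chosen only for illustration.

```python
def predicted_diastolic(bmi, a=64.145, b=0.811):
    """Predicted diastolic pressure from the fitted line y = a + b * BMI."""
    return a + b * bmi

for bmi in (20, 25, 30):
    print(f"BMI {bmi}: predicted diastolic = {predicted_diastolic(bmi):.2f}")
```

Such predictions are only reasonable within the range of BMI values seen in the data, and they describe the average for a group, not an individual.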
Exercise 14: Run a logistic regression for the problem of maternal smoking and SGA babies.
The file to use is sga-logistik.sav.
These notes end here. The presentation may seem somewhat muddled; if that confusion truly exists in your mind, it means you have understood the true nature of statistics.
References
1. Azmi Tamil, M Rizal Manaf. Bengkel Asas SPSS: Sesi Asas. October 2002.
2. http://bmj.bmjjournals.com/epidem/epid.1.html
3. John Gay. Clinical Study Design and Methods Terminology. August 22, 1999. (Online: http://www.vetmed.wsu.edu/courses-jmgay/GlossClinStudy.htm)
4. Azmi Tamil. Pakej Belajar Sendiri SPSS. (Online: http://161.142.92.99/hululangat/stat/spss.htm)
5. The Shodor Foundation Education Program. Introduction to Statistics: Mean, Median, and Mode. (Online: http://www.shodor.org/interactivate/lessons/sm1.html)
6. Rahman JA. Biostatistics, 2009.
Appendix: Matching Statistical Tests to Variable Types

As taught earlier, there are broadly 2 main types of variable: qualitative (categorical) and quantitative (numerical; discrete & continuous). When we want to test the relationship between 2 variables (bivariate analysis), the test to use depends on the types of the variables being tested. Below is a general guide to the tests that can be used for bivariate analysis, based on the types of the variables. It is divided into parametric tests (normally distributed data) and non-parametric tests (data not normally distributed).
Parametric bivariate tests

Variable 1 | Variable 2 | Criteria | Test
Qualitative | Qualitative | Sample size > 20 and no expected value less than 5 | χ² test
Qualitative dichotomous | Qualitative dichotomous | Sample size > 30 | χ² test
Qualitative dichotomous | Qualitative dichotomous | Sample size > 40 but one of the expected values < 5 | χ² test with Yates correction
Qualitative dichotomous | Quantitative | Normally distributed data | Student's t test
Qualitative polytomous | Quantitative | Normally distributed data | ANOVA
Quantitative | Quantitative (paired) | Repeated measurements on the same individual and the same item (e.g. Hb level before and after treatment). Normally distributed data | Paired t test
Quantitative continuous | Quantitative continuous | Normally distributed data | Pearson correlation & linear regression
Non-parametric bivariate tests

Variable 1 | Variable 2 | Criteria | Test
Qualitative dichotomous | Qualitative dichotomous (unpaired) | Sample size < 20, or < 40 with one of the expected values < 5 | Fisher's exact test
Dichotomous | Dichotomous (paired) | | McNemar chi-square test
Qualitative dichotomous | Quantitative | Data not normally distributed | Wilcoxon rank-sum test or Mann-Whitney U test
Qualitative polytomous | Quantitative | Data not normally distributed | Kruskal-Wallis one-way ANOVA
Quantitative | Quantitative | Repeated measurements on the same individual and the same item. Data not normally distributed | Wilcoxon signed-rank test
Quantitative continuous | Quantitative continuous | Data not normally distributed | Spearman/Kendall rank correlation
General tests
Variable 1 | Variable 2 | Test
Dichotomous | Nominal | Chi-square test
Nominal | Nominal | Chi-square test
Steps for Analysing Data

Organise the data
Check categories and coding
Qualitative: summarise the data as a diagram or flow chart
Quantitative: computer analysis
Review the study paper in terms of its objectives and methodology
Group discussion
Data collection
Re-analyse with reference to the stated objectives
Study report
Additional NOTES
