The First International Proficiency Testing Conference
Sinaia, România, 11th-13th October, 2007

SISLI ETFAL RESEARCH AND TRAINING HOSPITAL ENDOCRINOLOGY PROFICIENCY TESTING SCHEME: FIRST NATIONAL ENDOCRINOLOGY PROFICIENCY TESTING IN TURKEY: A PILOT PROJECT

Berna Aslan, Nezaket Eren, Nihal Yücel, Ayşen Yaral, Leyla Koç Öztürk
Şişli Etfal Research and Training Hospital, Biochemistry Laboratory, 46 ŞİŞLİ-İSTANBUL
Marmara University, Faculty of Dentistry, Department of Biochemistry, NİŞANTAŞI-İSTANBUL
aslan_berna@hotmail.com, nezaketeren@yahoo.co.uk, nihal.yucel@turk.net

Abstract
Turkey has few proficiency testing (PT) providers in the field of medical laboratories, and no national PT provider offers endocrinology tests. By establishing a PT scheme in endocrinology, we tried to meet this need. In addition to answering the demand in the market, we would also like to use this scheme to give our residents practical training in different aspects of PT. All of them will be prospective participants in PT schemes, and some of them will be providers. We compared the thyroid stimulating hormone, thyroxine, free thyroxine, tri-iodothyronine and free tri-iodothyronine results of the participant laboratories. A series of two samples consisting of pooled human serum with analytes at clinically relevant concentrations is prepared for each PT event. The PT specimens are subsequently verified, sterile-filtered (. µm pore size) and aliquoted. Aliquots were stored frozen at -8 °C until the lyophilization (freeze-drying) procedure. The variability of the test materials received by the participants should be small. For this purpose, analytical outliers were checked for and deleted from the data by using Cochran's test, and then a one-way analysis of variance was carried out. The samples were sufficiently homogeneous. The consensus of participants' results approach is used to determine the assigned value for a test material.
First, clearly spurious results are excluded to guard against reporting errors. Then the median/robust standard deviation method is used to calculate the assigned value and the robust standard deviation. Acceptable ranges are established using the fixed criteria specified by CLIA '88. Minitab statistical software and MS Excel are used for statistical evaluation.

Key words
Proficiency tests, interlaboratory comparisons, external quality assessment schemes, robust statistics.

1 INTRODUCTION
Quality improvement in the modern clinical laboratory requires continuous inspection of the laboratory's performance and refinement of the procedures being used, so that its services can meet the needs and expectations of the users. Laboratories that participate in proficiency testing schemes obtain an objective tool for assessing and demonstrating the reliability of the data they produce []. Although there are various types of proficiency testing schemes, most of them compare the test results of two or more laboratories. In our country there are few PT schemes in the field of clinical laboratories, and none of them covers endocrinology tests. Big laboratories obtain their proficiency tests from foreign suppliers. This has some drawbacks. First, PT schemes provided by foreign suppliers are very expensive and do not reach the small laboratories in the country. Second, they do not compare results within our country, so they are poorly suited to revealing regional problems. By establishing a PT scheme in endocrinology, we tried to meet this need. In addition to answering the demand in the market, we would also like to use this scheme to give our residents practical training in different aspects of PT. All of them will be prospective participants in PT schemes, and some of them will be providers. We compared the thyroid stimulating hormone (TSH), thyroxine (T4), free thyroxine (FT4), tri-iodothyronine (T3) and free tri-iodothyronine (FT3) results of the participant laboratories.

2 EXPERIMENTAL DESIGN
2.1 Preparation of Samples
A series of two samples consisting of pooled human serum with analytes at clinically relevant concentrations was prepared for the PT event: Sample A and Sample B.
We prepared the human serum pool from patient samples submitted for HBsAg, anti-HCV and HIV tests in our hospital's microbiology laboratory. Only samples that were negative for HBsAg, anti-HCV and HIV were used for the pool. We mixed the pool gently and then divided it into two parts. To prepare the high-level sample, patient plasmas with high hormone levels were used. The two pools obtained were then processed separately. We filtered both pools through . µm pore size filters to remove microorganisms that might degrade the components of the samples. After filtration, we divided the pools into ml aliquots. Aliquots were stored frozen at -8 °C until the lyophilization (freeze-drying) procedure. No preservatives or stabilizers were added to the PT samples, to avoid potential interference from additives. Samples were labeled after freeze-drying. QC samples were stored at -8 °C after lyophilization.
We performed the homogeneity test as explained in ISO guide 8 []. Samples were distributed to 4 clinical chemistry laboratories in İstanbul. Transportation took a maximum of -4 hours.

2.2 Participants
This was the first round of the scheme, and it was a closed round. Participants were invited by phone and informed about the scheme. Documents about the scheme and result return sheets were then prepared and sent with the samples to the participants who wished to take part. 4 laboratories took part in this first round of the scheme; of them were in university hospitals, 7 were in research and training hospitals, 5 were in state hospitals and were in private hospitals.

2.3 Schedule of PT Scheme
Three PT events were scheduled annually: April, August and December. A series of two samples is sent for each round. Participants report their results within a week of receiving the samples.

2.4 Statistical analysis
We evaluated the homogeneity of the PT samples by using Cochran's test and one-way ANOVA. To evaluate the test results of the participant laboratories we used the robust mean and the robust standard deviation, which are statistical techniques recommended in the International Harmonized Protocol for the Proficiency Testing of (Chemical) Analytical Laboratories [3, 4].

3 RESULTS
3.1 Checking samples for homogeneity
Any condition relating to the samples affects the outcome of the PT scheme. Despite efforts to ensure homogeneity, samples prepared for PT schemes are usually heterogeneous to some degree. Aliquots of the bulk material distributed to participant laboratories show slight variation in composition among themselves [6]. The procedure recommended in ISO 13528 [2] and the IUPAC Harmonized Protocol Appendix [6] for checking sample homogeneity aims to show whether this variation is sufficiently small for the purpose.
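As a rough illustration of the kind of check these documents describe, the sketch below computes Cochran's statistic and a one-way ANOVA for duplicate measurements on each aliquot, then applies the simple criterion that the sampling standard deviation be smaller than 0.3 times the standard deviation for proficiency assessment. The function names, the example data and the value of `sigma_p` are hypothetical and are not taken from the study; the Cochran critical value is left to the caller because it depends on the number of aliquots.

```python
from math import sqrt


def cochran_statistic(duplicates):
    """Cochran's C for duplicate results: largest squared difference
    divided by the sum of squared differences (compare with a tabulated
    critical value for the number of aliquots used)."""
    sq_diffs = [(a - b) ** 2 for a, b in duplicates]
    return max(sq_diffs) / sum(sq_diffs)


def homogeneity_check(duplicates, sigma_p):
    """One-way ANOVA on duplicate results for m aliquots.

    Returns (s_an, s_sam, passed); passed is True when
    s_sam < 0.3 * sigma_p.
    """
    m = len(duplicates)
    # Analytical (within-aliquot) variance from the duplicate differences.
    s_an_sq = sum((a - b) ** 2 for a, b in duplicates) / (2 * m)
    # Between-aliquot mean square from the aliquot means (2 replicates each).
    means = [(a + b) / 2 for a, b in duplicates]
    grand_mean = sum(means) / m
    ms_between = 2 * sum((x - grand_mean) ** 2 for x in means) / (m - 1)
    # Sampling variance; a negative estimate is treated as zero.
    s_sam_sq = max(0.0, (ms_between - s_an_sq) / 2)
    s_an, s_sam = sqrt(s_an_sq), sqrt(s_sam_sq)
    return s_an, s_sam, s_sam < 0.3 * sigma_p


if __name__ == "__main__":
    # Hypothetical duplicate results for four aliquots (illustration only).
    example = [(10.0, 10.2), (10.1, 9.9), (10.3, 10.1), (9.8, 10.0)]
    print(cochran_statistic(example))
    print(homogeneity_check(example, sigma_p=0.5))
```

The sketch keeps the analytical and sampling components separate so that either can be inspected on its own; a real scheme would normally use more aliquots than this example.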
Sufficient homogeneity in PT samples means that the variation in composition among the distributed units (the sampling standard deviation, σ_sam) is small enough to be negligible in relation to the variation introduced by the measurements conducted by the participants in the proficiency test (the standard deviation for proficiency assessment, σ_p) [,, 5]. The 1993 Harmonized Protocol [7] required that σ_sam be smaller than 0.3 σ_p. In our study we chose samples in random order and analyzed their TSH levels twice, again in random order. Using Cochran's test, we looked for analytical outliers; there were none. We performed a one-way analysis of variance (ANOVA) as explained in the protocol, using Excel software. Then we calculated σ_sam and compared it with σ_p. We found that σ_sam was , and σ_p was , in Sample A, and because , < ,*, it was sufficiently homogeneous for our
purpose. In Sample B, σ_sam was ,89 and σ_p was ,55. Because ,89 < ,*,55, it was sufficiently homogeneous for our scheme. The results of the homogeneity tests for Sample A and Sample B are shown in Figure 1.

Figure 1 - Results of homogeneity tests: homogeneity data for TSH in Sample A and in Sample B, plotted as TSH (U/L) against aliquot number; in both panels the critical value c is greater than σ sam (σ_an, analytical standard deviation; σ_sam, sampling standard deviation; σ_p, standard deviation for proficiency assessment).

3.2 Evaluation of the participant laboratories' results
We first checked the frequency distribution of each series. Using Minitab software, we determined whether the distributions fit a normal distribution with the Anderson-Darling test for normality. All the distributions except that of FT3 in Sample B were normal. Second, we calculated the arithmetic mean, standard deviation, uncertainty of the arithmetic mean, robust mean, robust standard deviation and uncertainty of the robust mean by using Microsoft Excel. The consensus of participants' results approach is used to determine the assigned value for a test material. The influence of outliers is minimized by the use of robust statistical procedures [1, 2, 3]. Results are shown in Table 1.
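The robust statistics listed above can be sketched as follows. The paper does not state which robust estimator was computed in Excel, so this example assumes the median as the assigned value and a robust standard deviation of 1.483 times the median absolute deviation; the factor 1.25 in the uncertainty follows the form commonly given in ISO 13528 and is likewise an assumption here. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import median


def robust_summary(results):
    """Return (assigned value, robust SD, uncertainty, CV %) for a
    list of participant results, using median/MAD estimates."""
    assigned = median(results)
    # Median absolute deviation from the median.
    mad = median(abs(x - assigned) for x in results)
    # Scaling by 1.483 makes the MAD comparable to an SD for normal data.
    robust_sd = 1.483 * mad
    # Standard uncertainty of the assigned value (assumed ISO 13528 form).
    uncertainty = 1.25 * robust_sd / sqrt(len(results))
    cv_percent = 100 * robust_sd / assigned
    return assigned, robust_sd, uncertainty, cv_percent


if __name__ == "__main__":
    # Hypothetical results with one gross outlier (illustration only).
    print(robust_summary([1.0, 1.1, 1.2, 1.3, 9.0]))
```

In this hypothetical data set the outlying value 9.0 leaves the median unchanged, which is the property that motivates robust estimates in consensus-based schemes.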
Table 1. Summary of the results of the first round of the pilot study.
Columns: Number of Participants; Arithmetic Mean; Standard Deviation; Uncertainty of Arithmetic Mean; Robust Mean; Robust Standard Deviation; CV %; Uncertainty of Robust Mean
Sample A
TSH (U/L) 4,669,5,,655, 8,7
T4 (µg/dl) 9,7,4,8 9,,45 6,
T3 (ng/ml),,4,48,8,85 6,4
FT4 (ng/dl) 4,94,5,5,95,7 7,48
FT3 (pg/ml) 4,957,6,,94,46 5,89
Sample B
TSH 4,6,84,74,95,556 7,8
T4 (µg/dl) 8,64,9,9 8,79,877,8
T3 (ng/ml),,4,5,5,78 5,8
FT4 (ng/dl) 4,5,88,8,55,6 4,
FT3 (pg/ml) 4,86,65,,79,45 9,5

Finally, z-scores were calculated by using the robust standard deviation. Each z-score was obtained by dividing the difference between the laboratory result and the target robust mean by the robust standard deviation: z = (laboratory result - target robust mean) / robust standard deviation. z-scores outside ± 3 standard deviations (SD) were taken to indicate a procedure that needed to be investigated and were considered unsatisfactory; values inside the ± 2 SD limits were considered satisfactory; and z-scores outside the ± 2 SD limits but inside the ± 3 SD limits were considered questionable. The z-scores of the participant laboratories for TSH, FT4, FT3, T4 and T3 in Sample A and Sample B are shown in Figure 2.

For TSH, only one laboratory had z-scores beyond the ± 3 SD limits (on the low side) for both Sample A and Sample B. All the other laboratories had z-scores within ± 2 SD, and their results were considered satisfactory. This laboratory's results were also outside the limits permitted by CLIA '88 for TSH (target ± 3 SD).

For FT4, all laboratories received a satisfactory performance score except two, whose scores were questionable. One of them had z-scores higher than ± 2 SD. The other had z-scores between ± 2 and ± 3 SD for Sample B, while its z-scores for Sample A were inside the ± 2 SD limits. The results of all the laboratories were inside the limits stated by CLIA '88 (target ± 3 SD) [7].
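The z-score calculation and the three performance categories used in the paragraphs above can be sketched as below. The limits of 2 and 3 are the conventional ones for z-scores and are assumed here; the function names and the numbers in the usage lines are hypothetical.

```python
def z_score(result, assigned_value, robust_sd):
    """z = (laboratory result - assigned value) / robust standard deviation."""
    return (result - assigned_value) / robust_sd


def classify(z):
    """Map a z-score to a performance category (assumed limits: 2 and 3)."""
    if abs(z) <= 2:
        return "satisfactory"
    if abs(z) < 3:
        return "questionable"
    return "unsatisfactory"


if __name__ == "__main__":
    # Hypothetical assigned value 2.0 and robust SD 0.25 (illustration only).
    for result in (2.5, 2.6, 1.0):
        z = z_score(result, 2.0, 0.25)
        print(result, round(z, 2), classify(z))
```

Keeping the score and the classification in separate functions lets the same z-scores be re-graded if a scheme adopts different limits.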

We found two laboratories whose T4 z-scores for Sample B were outside the ± 2 SD limits but inside the ± 3 SD limits. The z-scores of these laboratories for Sample A were inside the ± 2 SD limits, as were those of all the other laboratories. Their Sample B results were also outside the CLIA '88 limits for T4 (target mean ± 20% or 1.0 µg/dl) [7].
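A fixed-limit criterion of the "target ± percentage or absolute amount" kind can be checked as in the following sketch, which assumes that the wider of the two limits applies. The function name and the numerical values are hypothetical and serve only to show the comparison; they are not results from the scheme.

```python
def within_fixed_limit(result, target, percent, absolute):
    """True if result lies within target ± the greater of
    (percent of target) and (absolute amount)."""
    allowed = max(abs(target) * percent / 100.0, absolute)
    return abs(result - target) <= allowed


if __name__ == "__main__":
    # Hypothetical target of 8.0 with limits of 20% or 1.0 (illustration only).
    print(within_fixed_limit(9.0, 8.0, percent=20, absolute=1.0))
    print(within_fixed_limit(10.0, 8.0, percent=20, absolute=1.0))
```

At low target values the absolute amount dominates, while at higher values the percentage does, which is why both are passed in explicitly.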

Figure 2 - z-scores of the participant laboratories (scatterplots of Sample A versus Sample B z-scores for TSH, T4, FT4, FT3 and T3).

The frequency distribution of the FT3 results obtained for Sample B did not fit a normal distribution (Figure 3). One laboratory reported higher FT3 levels and had z-scores higher than ± SD in both samples. In addition, there was one laboratory whose z-score was lower than ± SD for Sample B and between ± 2 and ± 3 SD for Sample A. Two other laboratories had questionable results in at least one sample. of the laboratories sent results within the ± 2 SD limits. When we compared the results with the limits stated by CLIA '88, we found that all laboratories with questionable or unsatisfactory z-scores also failed the CLIA '88 [7] criteria (target ± 5% or .5 pg/ml).
laboratories sent results for T3; of them were inside the ± standard deviation limits, while one of the remaining laboratories had z-scores greater than SD for Sample B and greater than SD for Sample A, and the other had a z-score inside the ± SD limit for Sample B and outside the SD for Sample A. Only one laboratory sent a result that failed the CLIA criteria for T3 (target ± 3 SD) [7, 8].

Figure 3 - Graphical summary of basic statistics of Sample A and Sample B (Minitab summaries for FT3 in Sample A and Sample B, each reporting the Anderson-Darling normality test, mean, standard deviation, variance, skewness, kurtosis, N, minimum, quartiles, median, maximum and 95% confidence intervals for the mean, median and standard deviation).

4 DISCUSSION
Proficiency testing (PT) schemes are an integral part of laboratory quality assessment. These schemes provide very useful information about the technical performance of laboratories, including the validity of analytical measurements. A PT scheme coordinates interlaboratory comparisons to evaluate laboratories' performance externally and objectively. In doing so, it gives laboratories a convenient tool for monitoring their performance and the results of their corrective actions. Because accreditation bodies recommend that clinical laboratories seeking accreditation according to ISO 15189 [8] participate in appropriate PT schemes, a growing number of laboratories would like to take part in PT schemes. This pilot project aims to create a PT scheme that participants around the country can easily reach.

PT organizers distribute portions of a homogeneous material to their participants, collect the test results from them, perform statistical analysis and produce reports to inform the participants about their laboratory performance. Data produced by a PT scheme can be crucial for many users, and can also lead to misinterpretation. The PT provider therefore has to create report formats that users can interpret easily. For this purpose we designed reports containing a graphical summary of all the results, with the position of the user marked on it, so that participants can see where they stand among the others. We also give the number of participants, arithmetic mean, standard deviation, robust mean, robust standard deviation and the performance scores of the laboratories, calculated by using the robust standard deviation, in their own languages. We prepared the PT specimens in our laboratory, performed homogeneity tests and found that the specimens were sufficiently homogeneous for our purpose; the cost was lower than that of the foreign suppliers.
Although all the distributions obtained in the scheme fit a normal distribution except that of FT3 in Sample B, we calculated the performance scores of the participants by using the robust mean in order to eliminate the effects of outliers. We also evaluated the participants' results according to the criteria stated by CLIA '88. Because the CLIA criterion for TSH, FT4 and T3 is target ± 3 SD, results evaluated as questionable according to z-scores became satisfactory when the CLIA rules were applied. For FT3, we found that all laboratories with questionable or unsatisfactory z-scores also failed the CLIA '88 criteria (target ± 5% or .5 pg/ml). We observed that the T4 results of two laboratories, which were outside the ± 2 SD limits but inside the ± 3 SD limits for Sample B, were also outside the CLIA '88 limits for T4 (target mean ± 20% or 1.0 µg/dl). These observations showed that the performance scores produced in our scheme could catch results that were outside the CLIA [7] limits.

5 CONCLUSIONS
With this pilot project, we coordinated an endocrinology proficiency test for the first time in our country. We produced the proficiency test samples in our laboratory at a very affordable cost. Participants in the first round were medium-sized and large laboratories in İstanbul, because the first round was a closed one, but the scheme is also affordable for small laboratories that perform thyroid tests.

REFERENCES
[1] ISO Guide 43-1:1997. Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes.
[2] ISO/FDIS 13528:2005. Statistical methods for use in proficiency testing by interlaboratory comparisons.
[3] IUPAC 2006 (M. Thompson, S.L.R. Ellison and R. Wood). The International Harmonized Protocol for the Proficiency Testing of Analytical Chemistry Laboratories (IUPAC Technical Report, revised 2006). Pure Appl. Chem. 78, 145-196.
[4] IUPAC 1993 (M. Thompson and R. Wood). The International Harmonised Protocol for the Proficiency Testing of (Chemical) Analytical Laboratories. Pure Appl. Chem. 65, 2123-2144.
[5] Analytical Methods Committee, Royal Society of Chemistry. Robust statistics: a method of coping with outliers. No. 6, April.
[6] ISO Guide 43-2:1997. Proficiency testing by interlaboratory comparisons - Part 2: Selection and use of proficiency testing schemes by laboratory accreditation bodies.
[7] Centers for Disease Control and Prevention. Current CLIA Regulations (including all changes through //4), Part 493 (Laboratory Requirements), Subpart I (Proficiency Testing Programs for Non-waived Testing), section 493.933 Endocrinology, page 95 (www.phppo.cdc.gov/clia/regs/subpart_i.aspx#49.9).
[8] ISO 15189. Medical laboratories - Particular requirements for quality and competence.