ODM Clinical Data Analysis – a Tool for the Automatic Generation of Generic Descriptive Statistics
Published: August 29, 2017
Introduction: A required step in presenting the results of clinical trials (CTs) is the declaration of participants’ demographic and baseline characteristics, as required by FDAAA 801 [1]. The common workflow for this task is to export clinical data from the electronic data capture system (EDC) and perform the statistical analysis in software such as SAS [2] or IBM SPSS [3]. This commercial software requires trained users and is therefore poorly suited to local physicians; it may also be too expensive for small CTs. The objective of this work is to present an open source application for the automatic analysis of arbitrary clinical data. It can be used to gain an overview of collected data for monitoring purposes and to generate descriptive statistics, e.g., for reports or publications.
Methods: CDISC's XML-based Operational Data Model (ODM) [4] is the transfer format of choice, since most EDCs support this export format [5]. An advantage of ODM is that it contains both the metadata needed for the analysis, e.g., which study events and case report forms (CRFs) have been used, and the associated clinical data of each subject. ODM Clinical Data Analysis is implemented as a web application and provided as a Docker image [6] under the GPL license. It is written in Java in combination with state-of-the-art web development libraries, and Apache Tomcat [7] serves as the servlet container. To avoid exhausting the heap of the Java Virtual Machine, the H2 database [8] is used as an intermediate store for the calculations.
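The following is a minimal sketch of the idea described in the Methods, not the tool's actual implementation (which is written in Java and uses H2). It reads item definitions and item values from an ODM 1.3 document and stages them in a database so that later aggregation does not have to hold everything in program memory. The sample ODM fragment, the function name `stage_odm`, the table layout, and the use of SQLite as a stand-in for H2 are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespace of ODM 1.3 documents.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Tiny hand-written ODM fragment (hypothetical study, not real data).
SAMPLE_ODM = """
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S.1">
    <MetaDataVersion OID="MDV.1" Name="v1">
      <ItemDef OID="I.AGE" Name="Age" DataType="integer"/>
      <ItemDef OID="I.WEIGHT" Name="Weight" DataType="float"/>
    </MetaDataVersion>
  </Study>
  <ClinicalData StudyOID="S.1" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SE.BASE">
        <FormData FormOID="F.DEMO">
          <ItemGroupData ItemGroupOID="IG.DEMO">
            <ItemData ItemOID="I.AGE" Value="54"/>
            <ItemData ItemOID="I.WEIGHT" Value="81.5"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
    <SubjectData SubjectKey="002">
      <StudyEventData StudyEventOID="SE.BASE">
        <FormData FormOID="F.DEMO">
          <ItemGroupData ItemGroupOID="IG.DEMO">
            <ItemData ItemOID="I.AGE" Value="61"/>
            <ItemData ItemOID="I.WEIGHT" Value="70.2"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def stage_odm(odm_xml: str, conn: sqlite3.Connection) -> None:
    """Copy ODM metadata and clinical data into two staging tables."""
    root = ET.fromstring(odm_xml)
    conn.execute(
        "CREATE TABLE item_def (oid TEXT PRIMARY KEY, name TEXT, data_type TEXT)"
    )
    conn.execute(
        "CREATE TABLE item_data "
        "(subject TEXT, event TEXT, form TEXT, item_oid TEXT, value TEXT)"
    )
    # Metadata: one row per ItemDef, including its declared data type.
    for item_def in root.iterfind(".//odm:MetaDataVersion/odm:ItemDef", ODM_NS):
        conn.execute(
            "INSERT INTO item_def VALUES (?, ?, ?)",
            (item_def.get("OID"), item_def.get("Name"), item_def.get("DataType")),
        )
    # Clinical data: one row per ItemData value, keyed by subject, event and form.
    for subject in root.iterfind("odm:ClinicalData/odm:SubjectData", ODM_NS):
        for event in subject.iterfind("odm:StudyEventData", ODM_NS):
            for form in event.iterfind("odm:FormData", ODM_NS):
                for item in form.iterfind("odm:ItemGroupData/odm:ItemData", ODM_NS):
                    conn.execute(
                        "INSERT INTO item_data VALUES (?, ?, ?, ?, ?)",
                        (
                            subject.get("SubjectKey"),
                            event.get("StudyEventOID"),
                            form.get("FormOID"),
                            item.get("ItemOID"),
                            item.get("Value"),
                        ),
                    )
    conn.commit()


conn = sqlite3.connect(":memory:")
stage_odm(SAMPLE_ODM, conn)
# Number of collected values per item, joined with the item's metadata.
rows = conn.execute(
    "SELECT d.name, d.data_type, COUNT(*) FROM item_data v "
    "JOIN item_def d ON d.oid = v.item_oid GROUP BY d.oid ORDER BY d.name"
).fetchall()
print(rows)
```

For brevity the sketch parses the whole document at once; for files of the size reported in the Results, an incremental parser such as `xml.etree.ElementTree.iterparse` would probably be the better choice.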
Results: After an ODM file has been uploaded, the analysis is performed without further user interaction. The generated statistics are presented in a clear web interface that shows the number of completed CRFs, participants, and completed data items. Depending on the data type of each item, different statistics and illustrative charts are provided. Five categories of data types are distinguished, based on the scales of measurement by Stevens [9]. As an example, the descriptive statistics of a float item comprise its range, median, arithmetic mean and standard deviation, while a histogram illustrates the item’s distribution. The entire analysis can be exported as a PDF for further examination.
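As an illustration of the statistics listed for a float item, the sketch below computes range, median, mean, standard deviation and histogram counts for a list of values. It is not taken from the tool: the function name `describe_float_item`, the equal-width binning, and the choice of the sample standard deviation (n - 1 in the denominator) are assumptions.

```python
import statistics


def describe_float_item(values, bins=4):
    """Descriptive statistics and equal-width histogram counts for one float item."""
    if not values:
        raise ValueError("at least one value is required")
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if all values are identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin instead of a bin of its own.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return {
        "n": len(values),
        "min": lo,
        "max": hi,
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
        # Assumption: sample standard deviation; undefined for a single value.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "histogram": counts,
    }


# Hypothetical body weights in kg.
summary = describe_float_item([60.0, 70.0, 80.0, 90.0, 100.0])
print(summary)
```

Items of other data types would need different summaries, for example frequency tables for categorical items; the abstract does not specify these, so they are left out here.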
On commodity hardware, the analysis of an 85 MB file with over 3500 forms and 2610 participants took approximately 5 minutes, including the creation of the PDF. The current Docker image can be obtained from https://hub.docker.com/r/wwuimi/odmauswertung/.
Discussion: The tool has been applied to several smaller and larger internal studies, in which its functionality was confirmed and a medical expert validated its usefulness. Nevertheless, repeating forms or item groups that generate multiple values for a single study subject may affect the validity of the statistics. Since there is no general solution to this problem, the user is warned visually when such data occur and should interpret the associated results with caution. For future work, we plan to support more features of the ODM standard as well as CDASH [10] and SDTM [11], whose additional semantic information can be used to generate more specific statistics for annotated items.
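The check behind such a warning can be sketched as follows: count the values per subject and item, and flag every item for which at least one subject contributes more than one value. The record layout and the function name `find_repeated_items` are hypothetical; how the tool itself detects repeats is not described in this abstract.

```python
from collections import Counter


def find_repeated_items(records):
    """Return {item_oid: [subject keys with more than one value for that item]}.

    `records` is an iterable of (subject_key, item_oid, value) tuples.
    """
    per_subject_item = Counter((subject, item) for subject, item, _ in records)
    flagged = {}
    for (subject, item), n in per_subject_item.items():
        if n > 1:
            flagged.setdefault(item, []).append(subject)
    return {item: sorted(subjects) for item, subjects in flagged.items()}


# Hypothetical data: subject 001 has two blood pressure readings.
records = [
    ("001", "I.BP", "120"),
    ("001", "I.BP", "125"),
    ("002", "I.BP", "118"),
    ("001", "I.AGE", "54"),
]
warnings = find_repeated_items(records)
print(warnings)
```

Statistics for a flagged item weight subjects unequally, which is one reason to treat them with caution.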
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
- 1. Section 801 of the Food and Drug Administration Amendments Act of 2007, Sec. 282(j)(3)(C)(i)
- 2. SAS Institute. SAS Software. Cary, North Carolina, USA; 2017 [Accessed 31 May 2017]. https://www.sas.com/
- 3. IBM. SPSS. Armonk, New York, USA; 2017 [Accessed 31 May 2017]. https://www.ibm.com/analytics/us/en/technology/spss/
- 4. Clinical Data Interchange Standards Consortium. Operational Data Model (ODM). [Accessed 31 May 2017]. https://www.cdisc.org/standards/transport/odm
- 5. Hume S, Aerts J, Sarnikar S, Huser V. Current applications and future directions for the CDISC Operational Data Model standard: A methodological review. Journal of Biomedical Informatics. 2016;60:352-362.
- 6. Docker Inc. Docker. Version 17.05.0. San Francisco, California, USA; 2017 [Accessed 31 May 2017]. https://www.docker.com/
- 7. Apache Software Foundation. Apache Tomcat. Version 8.5.11. Forest Hill, Maryland, USA; 2017 [Accessed 31 May 2017]. http://tomcat.apache.org/
- 8. Müller T. H2 Database Engine. Version 1.4.194. 2017 [Accessed 31 May 2017]. http://www.h2database.com/html/main.html
- 9. Stevens SS. On the theory of scales of measurement. 1946.
- 10. Clinical Data Interchange Standards Consortium. Clinical Data Acquisition Standards Harmonization (CDASH). [Accessed 31 May 2017]. https://www.cdisc.org/standards/foundational/cdash
- 11. Clinical Data Interchange Standards Consortium. Study Data Tabulation Model (SDTM). [Accessed 31 May 2017]. https://www.cdisc.org/standards/foundational/sdtm