Estimated date of delivery with electronic medical records by a hybrid GBDT-GRU model (2024)

Introduction

An accurate estimated date of delivery (EDD) supports good pregnancy outcomes and clinical decision making1, including distinguishing preterm from full-term birth, planning measures for fetal dysplasia, scheduling prenatal examinations, preparing nursing care for parturition and improving the efficiency of delivery. A reliable EDD is very important for reducing the occurrence of premature or postmature births and is critical for both short-term and long-term health outcomes in neonates. An inaccurate EDD may adversely affect the health and safety of pregnant women and fetuses.

The current clinical methods of determining the EDD rely on the last menstrual period (LMP) and ultrasound2,3,4. Of these, Naegele’s rule, which is based on the LMP, is the most common and widely used way to calculate the EDD5. Naegele’s rule adds nine months and seven days to the first day of the LMP or, alternatively, takes the EDD to be 280 days after the first day of the LMP. However, the LMP has limitations, including errors in recalling the last menstruation, irregular menstrual cycles, use of oral contraceptives and bleeding in early pregnancy6. Several studies found that calculating the EDD from first-trimester ultrasound is more accurate than using the LMP7,8. Kessler et al.9 assessed the actual pregnancy length and the accuracy of EDD prediction based on fetal head circumference measured in the second trimester; their population-based validation of 21,451 deliveries showed that these measurements can be safely used to predict the EDD. Majola et al.10 compared the accuracy of LMP recall and an early ultrasound (EUS) in predicting the EDD in South African pregnant women, and found that EUS performed clearly better. However, predicting the EDD from ultrasound features alone is of limited help11, and the EDD is also affected by other factors. Pre-pregnancy weight and maternal age are important factors12,13. According to Staneva et al.14, psychological distress during pregnancy, such as depression, anxiety and/or perceived stress, may increase the risk of preterm birth. Lifestyle and behavioral habits during pregnancy also affect the EDD15,16. In addition, some studies suggest that the accuracy of the EDD gradually decreases as gestational age increases17. Therefore, the EDD should be determined as soon as LMP data or the first accurate ultrasound examination are available18,19. However, some studies showed that only about 5% of births occur exactly on the EDD, regardless of whether the LMP or ultrasound is used20.
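As a point of reference, the 280-day form of Naegele’s rule described above can be written in a few lines. This is only an illustrative sketch: the function name `naegele_edd` is hypothetical, and the calendar form of the rule (nine months plus seven days) can differ from the 280-day form by a few days depending on month lengths.

```python
from datetime import date, timedelta


def naegele_edd(lmp_first_day: date) -> date:
    """Return the EDD as 280 days after the first day of the LMP."""
    return lmp_first_day + timedelta(days=280)


# Example: first day of the LMP on 1 January 2020.
print(naegele_edd(date(2020, 1, 1)).isoformat())
```

The rule uses no information beyond the LMP date, which is why recall errors and irregular cycles translate directly into EDD errors.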

In research on medical prediction, machine learning models are widely used because of their high accuracy and efficiency. Liang et al.21 used a linear regression model to identify blood metabolites that can accurately predict gestational age and delivery date. van den Heuvel et al.22 used a convolutional neural network (CNN) to estimate fetal head circumference and thereby determine gestational age and delivery date. Fung et al.23 developed a machine learning approach based on ultrasound-derived fetal biometric data to estimate gestational age and delivery date, but the article did not specify the type of machine learning model. Schink et al.24 developed an algorithm to estimate the beginning of pregnancy in German claims data, focusing on the potential of the expected delivery date. Torres et al.25 designed a system to calculate gestational age and delivery date. They used images of the feet, face and ears of 130 newborn babies and a combination of fully convolutional networks, CNNs and support vector regressors (SVR). Kojita et al.26 used fetal MRI in early pregnancy to predict gestational age and the EDD. However, the above-mentioned prediction models ignored the effect of time series factors. An accurate EDD requires evaluating the physical condition of the pregnant woman and analyzing recent trends in fetal development across all stages of pregnancy. Because antenatal examination data are time series data, the EDD is closely related to the results of each examination.

Therefore, this study attempted to combine prenatal examination with electronic medical records to establish a hybrid time series model based on Gradient Boosting Decision Tree and Gated Recurrent Unit (GBDT–GRU) to predict the expected date of delivery.

Methods

Framework for the estimated date of delivery

This study aimed to predict the EDD using a hybrid model of GBDT and GRU. The GBDT-GRU model is intended to make more effective and reasonable decisions by learning from past records and mining hidden knowledge in the data. The block diagram of the prediction process is shown in Fig. 1, and each step is explained in detail below:

Fig. 1. GBDT-GRU framework for the estimated date of delivery.

Step 1: Data preparation. The original maternal data obtained from the electronic medical records (EMR) were cleaned and transformed. Considering the different physiological characteristics of different periods of pregnancy, the processed data were divided into two datasets: a dataset of the first trimester of pregnancy, and a dataset of the second and third trimesters of pregnancy.

Step 2: Feature selection. Important features were selected in each dataset by ranking the feature importances of a GBDT model, thereby avoiding information redundancy and reducing the dimension of the data.

Step 3: EDD prediction using the GBDT-GRU model. A preliminary EDD was predicted from the first-trimester data with the GBDT model and treated as a new feature for predicting a more precise EDD. The new feature was combined with the original features of the second and third trimesters of pregnancy and fed into a GRU model to generate the final EDD.

Step 4: Results evaluation. The EDD predictions of GBDT alone, the GBDT-GRU hybrid model and other methods were evaluated and compared.

Datasets preparation

In this study, the data were collected from the electronic medical records (EMR) of pregnant women in a maternity hospital in eastern China. We extracted the physical examination and ultrasound records of pregnant women who had a natural, full-term delivery between 2017 and 2020. The information was processed so that individuals could not be identified. This study was approved by the ethics committee of Hangzhou Women’s Hospital and performed in accordance with the Declaration of Helsinki (written permission with approval NO. 2019-02-2). Because the EDD needs to be predicted in advance, only physical examination records from before 35 weeks were used. The count of physical examinations of pregnant women in each gestational week is shown in Fig. 2. According to Fig. 2, there were too few examinations between 13 and 22 weeks. Besides the frequency of examinations, the examination items also differ across stages of pregnancy. Some ultrasound indicators, such as the gestational sac, appear only in the first trimester and disappear as pregnancy progresses (see the features of the pregnancy examination in Table 1). Therefore, we divided the dataset into two subsets according to the time of the pregnancy examination: the first-trimester dataset (gestational week 4 to less than 14) and the second- and third-trimester dataset (gestational week 23 to less than or equal to 35).
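The split into the two subsets can be sketched as follows. The week boundaries are the ones stated above; the record layout (a dictionary with a `gest_week` key) and the function name are assumptions made only for illustration. Records from weeks outside both ranges, including the sparsely covered weeks 14 to 22, are dropped.

```python
def split_by_trimester(records):
    """Split examination records into the two subsets by gestational week.

    `records` is a list of dicts with a hypothetical "gest_week" key.
    """
    first_trimester, later_trimesters = [], []
    for rec in records:
        week = rec["gest_week"]
        if 4 <= week < 14:        # first-trimester dataset
            first_trimester.append(rec)
        elif 23 <= week <= 35:    # second- and third-trimester dataset
            later_trimesters.append(rec)
    return first_trimester, later_trimesters


visits = [{"gest_week": w} for w in (3, 4, 13.9, 14, 22, 23, 35, 36)]
first, later = split_by_trimester(visits)
print(len(first), len(later))
```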

Fig. 2. Count of pregnant women in different gestational weeks.

Because pregnancy examination dates are variable and irregular, some samples have missing data. In this work, we deleted the samples that lacked key features, for example a sample that has basic features such as height and weight but lacks all important features such as gestational sac size, FAC and HC. Moreover, samples with more than 50% missing values were excluded from further analysis. The antenatal examination data are time series data, for which linear interpolation is a suitable way to fill missing values. The remaining missing values were therefore filled by linear interpolation between the two adjacent pregnancy examinations according to their times. Differences in the scales and units of the variables could affect the performance of the model. It was therefore necessary to normalize the data, both to keep features with larger value ranges from dominating other features and to improve the convergence speed of the model. Min–max normalization was used to scale the values to the range [0, 1], as given in Eq. (1):

$${x}^{*}=\frac{x-{x}_{min}}{{x}_{max}-{x}_{min}}$$

(1)

where \(x\) is the current feature value, \({x}_{min}\) and \({x}_{max}\) are the minimum and maximum values of that feature, and \({x}^{*}\) is the normalized feature value. The predictions generated by the model need to be denormalized as shown in Eq. (2), where \({y}_{predict}\) is the normalized model output, \(y\) is the prediction on the original scale, and \({x}_{min}\) and \({x}_{max}\) are taken from the target variable.

$$y={y}_{predict}*\left({x}_{max}-{x}_{min}\right)+{x}_{min}$$

(2)
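The preprocessing described in this subsection (time-based linear interpolation followed by min–max scaling, and the inverse transform of Eq. (2)) can be sketched in plain Python. Function names are hypothetical, and leaving gaps at the start or end of a series unfilled is an assumption, since the text only describes interpolation between two adjacent examinations.

```python
def interpolate_missing(times, values):
    """Fill None entries by linear interpolation between the nearest
    observed examinations before and after, weighted by time."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None or nxt is None:
            continue  # assumption: gaps at the edges are left unfilled
        frac = (times[i] - times[prev]) / (times[nxt] - times[prev])
        filled[i] = values[prev] + frac * (values[nxt] - values[prev])
    return filled


def normalize(x, x_min, x_max):
    """Eq. (1): scale x to [0, 1]."""
    return (x - x_min) / (x_max - x_min)


def denormalize(y_predict, x_min, x_max):
    """Eq. (2): map a normalized prediction back to the original scale."""
    return y_predict * (x_max - x_min) + x_min


# A measurement missing at week 28, observed at weeks 24 and 30.
series = interpolate_missing([24, 28, 30], [200.0, None, 260.0])
scaled = [normalize(v, min(series), max(series)) for v in series]
print(series, scaled)
```

Note that the minimum and maximum used for denormalization must be those of the variable being restored, otherwise the round trip does not return the original value.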

Feature selection

GBDT27 is a boosting algorithm based on classification and regression trees (CART). In this study, we used a GBDT model for feature importance analysis and selection in the two datasets. Feature selection with GBDT is based on calculating the gain of the split nodes of each decision tree and accumulating these gains to evaluate the usefulness of each feature28. The overall importance of a feature is obtained by averaging its importance over the individual trees. GBDT uses Eq. (3) as the measure of influence, where \({\widehat{I}}_{j}\) is the relative influence of feature \(j\).

$${{\widehat{I}}_{j}}^{2}=\frac{1}{M}\sum_{m=1}^{M}{{\widehat{I}}_{j}}^{2}\left({T}_{m}\right)$$

(3)

where \(\{{T}_{m}\}_{1}^{M}\) is the collection of decision trees and \(M\) is the number of trees. The importance of feature \(j\) in a single tree \(T\) is calculated according to Eq. (4):

$${{\widehat{I}}_{j}}^{2}\left(T\right)=\sum_{t=1}^{J-1}{{\widehat{i}}_{t}}^{2}1\left({v}_{t}=j\right)$$

(4)

where \({{\widehat{i}}_{t}}^{2}\) is the reduction in squared error achieved by the split at node \(t\), \({v}_{t}\) is the feature used for the split at node \(t\), \(1(\cdot)\) is the indicator function, and \(J-1\) is the number of non-leaf nodes of the tree.

In the feature selection process, we generated the set of feature weights \(W=\{{w}_{1},{w}_{2},\dots ,{w}_{n}\}\) from the prenatal examination datasets and the selection results of the GBDT model, where \({w}_{i}\) is the weight of the \(i\)th feature. Feature selection was performed on the two subsets based on the contribution of each feature. We added the features one by one in descending order of weight, and selected the features used in this experiment by comparing the error and running time.
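The incremental procedure can be sketched as below. This is a hedged illustration, not the authors’ code: `forward_select`, the `evaluate` callback (which would train a GBDT on the given feature subset and return its MAE) and the example weights are all hypothetical. Preferring the shortest running time among subsets with the lowest MAE is an assumed reading of "comparing the error and running time"; ties are detected by exact equality here, which a real implementation would likely relax to a tolerance.

```python
import time


def forward_select(weights, evaluate):
    """Add features in descending order of weight and keep the subset
    with the lowest MAE, preferring the shortest running time on ties.

    weights:  dict mapping feature name -> importance weight
    evaluate: callable taking a list of feature names, returning the MAE
    """
    ranked = sorted(weights, key=weights.get, reverse=True)
    trials = []
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        start = time.perf_counter()
        mae = evaluate(subset)
        elapsed = time.perf_counter() - start
        trials.append((mae, elapsed, subset))
    best_mae = min(t[0] for t in trials)
    candidates = [t for t in trials if t[0] == best_mae]
    best_subset = min(candidates, key=lambda t: t[1])[2]
    return best_subset, trials


# Toy stand-in for "train GBDT and measure MAE" (values are made up).
toy_mae = {1: 9.0, 2: 7.0, 3: 7.5}
selected, _ = forward_select(
    {"GSL": 0.5, "GSW": 0.3, "Age": 0.2},
    lambda subset: toy_mae[len(subset)],
)
print(selected)
```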

Hybrid GBDT-GRU model

Our aim was to predict the remaining days of pregnancy, which is challenging because the future course of a pregnancy is uncertain and the accuracy requirement is strict. We designed a hybrid GBDT-GRU model, whose structure is shown in Fig. 3. GBDT is a boosting algorithm, which belongs to the category of ensemble learning29. Among the machine learning methods used in practice, GBDT is comparatively fast to train on large amounts of data and comparatively robust to outliers. In this study, we used the GBDT model for a first prediction on the first-trimester dataset and took the predicted result as the initial EDD. As a new feature, the initial EDD was then fused with the second- and third-trimester dataset to obtain a fused dataset.
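The fusion step can be pictured as appending the first-stage prediction to the feature vector of every later examination of the same woman. The text does not specify how the new feature is attached, so repeating it at each time step is an assumption, as are the function name and the dictionary-based data layout.

```python
def fuse_initial_edd(initial_edd, later_visits):
    """Append each woman's initial EDD to every later-trimester visit.

    initial_edd:  dict woman_id -> first-stage (GBDT) prediction
    later_visits: dict woman_id -> list of per-visit feature lists
    Women without a first-stage prediction are skipped (assumption).
    """
    fused = {}
    for woman_id, visits in later_visits.items():
        if woman_id not in initial_edd:
            continue
        fused[woman_id] = [features + [initial_edd[woman_id]] for features in visits]
    return fused


fused = fuse_initial_edd(
    {"a": 200.0},
    {"a": [[1.0, 2.0], [3.0, 4.0]], "b": [[5.0, 6.0]]},
)
print(fused)
```

Each woman’s fused visit list then forms one input sequence for the recurrent model.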

Fig. 3. Structure of GBDT-GRU.

GRU30 and LSTM31,32 are variants of the Recurrent Neural Network (RNN)33 that were proposed to address the vanishing and exploding gradient problems of traditional RNNs when training on long sequences34. Unlike LSTM, GRU includes only two gates: an update gate and a reset gate. This simplified structure enables GRU to reduce running time while maintaining prediction accuracy. With its update and reset gates, the GRU model can also handle time series data well. The input layer of the GRU receives the time series from the fused dataset, denoted \(X=\{{x}_{1},{x}_{2},\dots ,{x}_{t}\}\), where \({x}_{i}\) is the record of the \(i\)th physical examination of a pregnant woman. The hidden state \({h}_{t-1}\) contains the information of the previous time step. \({r}_{t}\) and \({z}_{t}\) denote the reset gate and the update gate, respectively. \({W}_{r}\) and \({W}_{z}\) are the weight matrices of the reset gate and the update gate, respectively, applied to the concatenation of the previous hidden state and the current input. \(\sigma\) is the sigmoid function. The gates are computed as follows:

$${r}_{t}=\sigma \left({W}_{r}\cdot \left[{h}_{t-1},{x}_{t}\right]\right)$$

(5)

$${z}_{t}=\sigma \left({W}_{z}\cdot \left[{h}_{t-1},{x}_{t}\right]\right)$$

(6)

The hidden state of the previous moment, after being filtered by the reset gate, is combined with the current input \({x}_{t}\), and \(tanh\) is the activation function. The candidate hidden state at the current moment, \({\tilde{h}}_{t}\), is defined as:

$${\tilde{h}}_{t}=\tanh\left({W}_{\tilde{h}}\cdot \left[{r}_{t}*{h}_{t-1},{x}_{t}\right]\right)$$

(7)

Then the update gate \({z}_{t}\) is used both to forget part of the previous memory and to select new memory, and the hidden state \({h}_{t}\) at time \(t\) is calculated by:

$${h}_{t}=\left(1-{z}_{t}\right)*{h}_{t-1}+{z}_{t}*{\tilde{h }}_{t}$$

(8)

Finally, we used the output at the last time step of the output layer as the final EDD. The final EDD is expected to be more accurate than the initial EDD.
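To make Eqs. (5) to (8) concrete, the following is a minimal pure-Python forward pass of a single GRU layer. It follows the equations as written (no bias terms, weights applied to the concatenation of the hidden state and the input); the helper names are hypothetical, and the output layer that maps the last hidden state to an EDD is omitted because its form is not specified in the text. In practice a deep learning library would be used instead.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(h_prev, x, W_r, W_z, W_h):
    """One GRU time step; h_prev and x are lists of floats."""
    concat = h_prev + x                                  # [h_{t-1}, x_t]
    r = [sigmoid(a) for a in matvec(W_r, concat)]        # Eq. (5), reset gate
    z = [sigmoid(a) for a in matvec(W_z, concat)]        # Eq. (6), update gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x   # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(a) for a in matvec(W_h, gated)]  # Eq. (7)
    return [(1.0 - zi) * hi + zi * ci                    # Eq. (8)
            for zi, hi, ci in zip(z, h_prev, h_tilde)]


def gru_last_state(sequence, W_r, W_z, W_h, hidden_size):
    """Run the GRU over a sequence of visits and return the last hidden state."""
    h = [0.0] * hidden_size
    for x in sequence:
        h = gru_step(h, x, W_r, W_z, W_h)
    return h


# One hidden unit, one input feature, illustrative weights.
W = [[0.1, -0.2]]
print(gru_last_state([[0.5], [0.7], [0.9]], W, W, W, hidden_size=1))
```

With all weights set to zero, both gates equal 0.5 and the candidate state is 0, so each step simply halves the previous hidden state, which is a quick sanity check of Eq. (8).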

The parameters of these prediction models were determined by grid search, and the models were validated with five-fold cross-validation. Grid search enumerates all combinations of the candidate parameter values and trains a model for each combination. After five-fold cross-validation, the hyperparameter combination with the highest average score was selected and the corresponding model was returned. The data used in the experiment included the records of multiple antenatal examinations of 5537 pregnant women; 80% of the dataset was used for training and the remaining 20% for testing. The GRU model is composed of an input layer, a hidden layer and an output layer. For GRU training, the antenatal examination data were transformed into a three-dimensional array of size (4430, X, Y), where 4430 is the number of pregnant women (about 80% of 5537), X is the number of antenatal examinations per woman and Y is the number of features per examination. The GRU model was trained with the Adam optimizer. Because this experiment is a regression task, GBDT used the mean squared error (least squares, “Ls”) as its loss function. Table 2 shows the parameter settings of the GBDT and GRU models.
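The search procedure can be sketched generically as follows. The parameter names in the example grid, the `fit_score` callback (which would train a model on the training indices and return a validation score, higher being better) and the interleaved fold assignment are assumptions for illustration; the actual grids are those reported in Table 2.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def grid_search(param_grid, fit_score, n_samples, k=5):
    """Try every parameter combination and keep the one with the highest
    mean validation score across the k folds."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = mean(fit_score(params, train, val)
                     for train, val in k_fold_indices(n_samples, k))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# 80/20 split of the 5537 women gives the first array dimension.
n_train = round(5537 * 0.8)
print(n_train)

# Toy scoring function standing in for "train and validate a model".
grid = {"learning_rate": [0.01, 0.1], "max_depth": [3, 5]}
toy = lambda p, train, val: -abs(p["learning_rate"] - 0.1) - abs(p["max_depth"] - 3)
print(grid_search(grid, toy, n_samples=n_train))
```

Because every combination is trained k times, the cost grows multiplicatively with the number of candidate values per parameter.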

Evaluation methodology

Prediction error is an essential factor in evaluating the proposed model. In this study, the coefficient of determination (\({R}^{2}\)), mean absolute error (MAE) and mean squared error (MSE) were used as the evaluation indices of the models. They are calculated as follows:

$${R}^{2}=1-\frac{\sum_{i=1}^{m}{\left({\widehat{y}}^{\left(i\right)}-{y}^{\left(i\right)}\right)}^{2}}{\sum_{i=1}^{m}{\left(\overline{y}-{y}^{\left(i\right)}\right)}^{2}}$$

(9)

$$MAE=\frac{1}{m}\sum_{i=1}^{m}\left| {y}^{\left(i\right)}-{\widehat{y}}^{\left(i\right)}\right|$$

(10)

$$MSE=\frac{1}{m}\sum_{i=1}^{m}{({y}^{\left(i\right)}-{\widehat{y}}^{\left(i\right)})}^{2}$$

(11)

where \({y}^{(i)}\) and \({\widehat{y}}^{(i)}\) are the real and predicted values, respectively, \(\overline{y }\) is the average of the real values, and \(m\) is the number of samples.
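The three indices follow directly from Eqs. (9) to (11). A minimal sketch, with hypothetical function names and made-up example values:

```python
def mae(y_true, y_pred):
    """Mean absolute error, Eq. (10)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error, Eq. (11)."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (9)."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((p - y) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y_bar - y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (e.g. remaining days of pregnancy).
real = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
print(mae(real, pred), mse(real, pred), r2(real, pred))
```

MAE and MSE are in days and squared days respectively, whereas \(R^2\) is unitless, which is why the three are reported together.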

In order to further assess the effectiveness of prediction based on the GBDT-GRU model, the bias in predicting EDD of each method was used as another critical index of prediction reliability. The \({D}_{bias}\) is defined as formula (12), where \({D}_{real}\) is the actual date of delivery and \({D}_{predict}\) is the EDD.

$${D}_{bias}=\left|{D}_{real}-{D}_{predict}\right|$$

(12)

By counting the proportion of women whose predictions fall within different \({D}_{bias}\) thresholds, we can assess the performance and usability of the different methods in practical applications. We calculated the accuracy under a specific requirement, \({Accuracy}_{bias}\), by formula (13).

$${Accuracy}_{bias}=\frac{{n}_{{D}_{bias}}}{N}\times 100\%$$

(13)

where \(N\) is the total number of pregnant women and \({n}_{{D}_{bias}}\) is the number of pregnant women whose prediction bias is less than or equal to the given \({D}_{bias}\) threshold.
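Formulas (12) and (13) amount to counting how many predictions fall within a given number of days of the actual delivery date. A small sketch with made-up dates and a hypothetical function name:

```python
from datetime import date


def accuracy_within(real_dates, predicted_dates, threshold_days):
    """Percentage of women whose |D_real - D_predict| is at most
    `threshold_days` (formulas (12) and (13))."""
    hits = sum(
        abs((real - pred).days) <= threshold_days
        for real, pred in zip(real_dates, predicted_dates)
    )
    return hits / len(real_dates) * 100.0


# Four illustrative deliveries with biases of 0, 2, 5 and 7 days.
real = [date(2020, 10, 7)] * 4
pred = [date(2020, 10, 7), date(2020, 10, 9), date(2020, 10, 2), date(2020, 10, 14)]
for threshold in (0, 2, 4, 6):
    print(threshold, accuracy_within(real, pred, threshold))
```

Because the thresholds are nested, the accuracy can only stay the same or increase as the threshold grows.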

Ethics declarations

This study is observational, presents no more than minimal risk of harm to subjects and involves no procedures for which written consent is normally required outside the research context. The study was approved by the ethics committee of Hangzhou Women’s Hospital and performed in accordance with the Declaration of Helsinki (written permission with approval NO. 2019-02-2). The informed consent requirement for this study was waived by the ethics committee of Hangzhou Women’s Hospital. The researchers accessed the database only for analysis purposes, and all data on pregnant women were de-identified during the experiment.

Results

Description of the experimental data

The dataset used in this study comes from a hospital in eastern China and includes a large amount of data such as maternal ultrasound records and prenatal examination reports. After preprocessing, the dataset comprised 33,222 pregnancy examination and ultrasound records of 5537 pregnant women. Table 1 describes the features of the dataset.

Feature importance and feature selection

We used GBDT to select the features that have a vital influence on the EDD. The selected features were used as the input of the prediction model, which reduced the input dimension and mitigated information redundancy. Feature importance reflects the contribution of each variable to the EDD. The feature selection results for the two datasets are shown in Fig. 4a, c. GSL was the most important variable affecting the EDD in the first-trimester dataset, followed by GSW, GSH, P-W, Age, MC and MD (Fig. 4a). FAC was the feature with the highest weight in the second- and third-trimester dataset, followed by FL, HC, BPD, AFI, UH, P-W, HGB, DBP, BLG, BMI, Age and SBP (Fig. 4c).

Fig. 4. Feature selection results for the two datasets. a Feature importance in the first-trimester dataset. b MAE and running time with different numbers of features in the first-trimester dataset. c Feature importance in the second- and third-trimester dataset. d MAE and running time with different numbers of features in the second- and third-trimester dataset.

We added the features one by one in descending order of weight. The MAE values and running times obtained after training GBDT with different numbers of features are shown in Fig. 4b, d. Among the feature groups with the lowest MAE, we chose the one with the shortest running time. Finally, seven features were retained in the first-trimester dataset and 13 features in the second- and third-trimester dataset. Table 3 shows the summary statistics of these features.

Evaluation and comparison of different models

To evaluate the experimental results of the GBDT-GRU model, we compared its predictions with those of Naegele’s rule and several machine learning models: Random Forest (RF), Support Vector Regression (SVR) and LSTM. RF is a powerful algorithm for classification and regression, in which the prediction is made by majority vote or by averaging the results of the ensemble35,36. SVR37 is an extension of the Support Vector Machine (SVM) to regression. Based on the features listed in Table 3, we constructed these machine learning models to predict the EDD.

The average results after five-fold cross-validation are shown in Table 4, which compares the performance of the prediction models on the different datasets. First, GBDT, RF and SVR were used to predict the initial EDD from the first-trimester dataset. Second, the final EDD was obtained with the time series model based on the fused dataset. Finally, MSE, \({R}^{2}\) and training time were used to compare the prediction results of the different models. Table 4 shows that the GBDT-GRU prediction model outperforms Naegele’s rule, all the single models and the other hybrid models, achieving an average MSE of 41.73 and an \({R}^{2}\) of 0.84. Moreover, the hybrid GRU models have a shorter training time than the hybrid LSTM models.

According to the difference between the EDD and the actual date of delivery, we recorded and compared the accuracy of each model under four categories: \({D}_{bias}\) less than or equal to zero, two, four and six days. The accuracy of the different methods under each \({D}_{bias}\) is shown in Fig. 5.

Fig. 5. The accuracy of different methods under different \({D}_{bias}\).

As shown in Fig. 5, our GBDT-GRU model achieved better prediction results than the other methods for every \({D}_{bias}\) threshold. The accuracy of the EDD predicted by the GBDT-GRU model was 6.9%, 32.6%, 53.6% and 69.8% for \({D}_{bias}\) ≤ 0, 2, 4 and 6 days, respectively. Notably, the accuracy advantage of the GBDT-GRU model becomes more pronounced as \({D}_{bias}\) increases.

Discussion

In this study, we used a hybrid model of GBDT and GRU to generate features from EMR and to predict the EDD of pregnant women. The accuracy of the GBDT-GRU model was superior to that of the other prediction methods. In addition, we selected the features with the greatest influence on the EDD to improve the predictive performance of the model.

The experimental results showed that the hybrid models (GBDT-GRU, GBDT-LSTM, RF-GRU and RF-LSTM) performed better than all single models. The hybrid models achieved an overall MSE smaller than 44.12 and an \({R}^{2}\) larger than 0.81. This suggests that hybrid models generalize better than the other models for EDD prediction and may better support medical staff in decision making. Furthermore, GRU performed better than LSTM on the time series data, which may be due to its simpler gate structure. The GBDT-GRU model exhibited the best performance among all models. As far as we know, this study is the first attempt to apply a hybrid model to data from different stages of pregnancy, so that the EDD can be adjusted according to the characteristics of each period of pregnancy. Our models therefore appear well suited to EDD prediction in healthcare services.

As shown in Fig. 5, the proposed model not only shortens the running time but also improves the prediction accuracy. When \({D}_{bias}\) is within six days, the accuracy of the GBDT-GRU model is 9.1% higher than that of Naegele’s rule. In addition, the results of this study may help in estimating the EDD and in developing guidelines for clinical delivery management.

Clinical research on the EDD still focuses on ultrasound and the LMP, for example head circumference, cervical length and some improved formula methods38. These studies provided a reference for feature selection in machine learning. In addition, EMR datasets offer great potential for EDD prediction in pregnancy. We found that several new features were closely related to childbirth and could enhance the accuracy of the EDD. Our results indicated that days of pregnancy and gestational sac size have a great influence on the EDD in the first trimester of pregnancy. For the second and third trimesters of pregnancy, days of pregnancy, FAC, AFI and BPD were relatively important features. Moreover, the feature importance given by the GBDT model provides a reference that can help doctors pay more attention to the key physiological indicators of pregnant women.

Our study also had several limitations. First, it used only physical examination data and ultrasound data for prediction; we did not consider the influence of laboratory parameters on the EDD. Second, the primary limitation of our study was possible selection bias due to its single-center design and small sample size, so its accuracy and practicality should be verified in prospective studies with larger samples.

Conclusions

In this paper, a hybrid model combining a GBDT model and a GRU model was proposed to predict the EDD. To obtain a more accurate EDD, we built the hybrid model on parameters from maternal and fetal physical examinations. The results show that GBDT-GRU achieves a satisfactory outcome in the experiment and that the accuracy of the EDD can be improved by adjusting the number of features. Our hybrid model is therefore an effective method to support clinical decision making, and artificial intelligence methods have great application potential in obstetrical practice. Future studies should also address the prediction of the EDD in cases of preterm delivery.

Data availability

The data that support the findings of this study are available from Hangzhou Women’s Hospital, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Hangzhou Women’s Hospital.

References

  1. Nguyen, T. H., Larsen, T., Engholm, G. & Møller, H. Evaluation of ultrasound-estimated date of delivery in 17 450 spontaneous singleton births: do we need to modify Naegele’s rule? Ultrasound Obstet. Gynecol. Off. J. Int. Soc. Ultrasound Obstet. Gynecol. 14, 23–28 (1999).

  2. Barr, W. B. & Pecci, C. C. Last menstrual period versus ultrasound for pregnancy dating. Int. J. Gynaecol. Obstet. 87, 38–39 (2004).

  3. Liang, S. & Li, X.-T. Advances in gestational age estimating based on last menstrual period and ultrasound. J. Int. Obstet. Gynecol. 41, 448 (2014).

  4. Morales Rosello, J., Hervas Marin, D. & Perales Marin, A. Can we predict delivery date with obstetric ultrasound? J. Maternal-Fetal Neonatal Med. 26, 1807–1811 (2013).

  5. Lawson, G. W. Naegele’s rule and the length of pregnancy: a review. Aust. N. Zeal. J. Obstet. Gynaecol. 61, 177–182 (2021).

  6. Hunter, L. A. Issues in pregnancy dating: revisiting the evidence. J. Midwifery Womens Health. 54, 184–190 (2009).

  7. Dietz, P. M. et al. A comparison of LMP-based and ultrasound-based estimates of gestational age using linked California livebirth and prenatal screening records. Paediat. Perinat. Epidemiol. 21, 62–71 (2007).

  8. Ruiz-Martinez, S. & Oros, D. Re: ISUOG Practice Guidelines on ultrasound assessment of fetal biometry and growth: time to pay attention to bias in Doppler studies. Ultrasound Obstet. Gynecol. Off. J. Int. Soc. Ultrasound Obstet. Gynecol. 54, 419 (2019).

  9. Kessler, J. et al. Estimated date of delivery based on second trimester fetal head circumference: a population-based validation of 21451 deliveries. Acta Obstet. Gynecol. Scand. 98, 101–105 (2019).

  10. Majola, L. et al. Reliability of last menstrual period recall, an early ultrasound and a Smartphone App in predicting date of delivery and classification of preterm and post-term births. BMC Pregn. Childbirth. 21, 1–9 (2021).

  11. Goto, E. Ultrasound fetal anthropometry to identify large-for-gestational-age: a meta-analysis. Min. Ginecol. 71, 467–474 (2019).

  12. Liu, B. et al. Association between maternal pre-pregnancy obesity and preterm birth according to maternal age and race or ethnicity: a population-based study. Lancet Diab. Endocrinol. 7, 707–714 (2019).

  13. Voerman, E. et al. Association of gestational weight gain with adverse maternal and infant outcomes. JAMA 321, 1702–1715 (2019).

  14. Staneva, A., Bogossian, F., Pritchard, M. & Wittkowski, A. The effects of maternal depression, anxiety, and perceived stress during pregnancy on preterm birth: a systematic review. Women Birth. 28, 179–193 (2015).

  15. Sagi-Dain, L. & Sagi, S. The effect of late pregnancy date fruit consumption on delivery progress–a meta-analysis. EXPLORE. 17, 569–573 (2021).

  16. Loy, S. L. et al. Maternal night-time eating and sleep duration in relation to length of gestation and preterm birth. Clin. Nutrit. 39, 1935–1942 (2020).

  17. Jaiswal, P., Masih, W. F., Jaiswal, S. & Chowdhary, D. S. Assessment of fetal gestational age by ultrasonic measurement of bi-parietal diameter in the southern part of Rajasthan. Med. J. Dr. DY Patil Univ. 8, 27 (2015).

  18. Mongelli, M., Wilcox, M. & Gardosi, J. Estimating the date of confinement: ultrasonographic biometry versus certain menstrual dates. Am. J. Obstet. Gynecol. 174, 278–281 (1996).

  19. Ng, K. Y. B. & Steer, P. J. Prediction of an estimated delivery date should take into account both the length of a previous pregnancy and the interpregnancy interval. Eur. J. Obst. Gynecol. Reprod. Biol. 201, 101–107 (2016).

  20. Khambalia, A. Z. et al. Predicting date of birth and examining the best time to date a pregnancy. Int. J. Gynecol. Obstet. 123, 105–109 (2013).

  21. Liang, L. et al. Metabolic dynamics and prediction of gestational age and time to delivery in pregnant women. Cell 181, 1680–1692 (2020).

  22. van den Heuvel, T. L. et al. Automated fetal head detection and circumference estimation from free-hand ultrasound sweeps using deep learning in resource-limited countries. Ultrasound Med. Biol. 45, 773–785 (2019).

  23. Fung, R. et al. Achieving accurate estimates of fetal gestational age and personalised predictions of fetal growth based on data from an international prospective cohort study: a population-based machine learning study. Lancet Dig. Health. 2, e368–e375 (2020).

  24. Schink, T., Wentzell, N., Dathe, K., Onken, M. & Haug, U. Estimating the beginning of pregnancy in German claims data: development of an algorithm with a focus on the expected delivery date. Front. Public Health. 8, 350 (2020).

  25. Torres, M. T., Valstar, M., Henry, C., Ward, C. & Sharkey, D. Postnatal gestational age estimation of newborns using Small Sample Deep Learning. Image Vis. Comput. 83, 87–99 (2019).

  26. Kojita, Y. et al. Deep learning model for predicting gestational age after the first trimester using fetal MRI. Eur. Radiol. 31, 3775–3782 (2021).

  27. Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).

  28. Huan, J., Li, H., Li, M. & Chen, B. Prediction of dissolved oxygen in aquaculture based on gradient boosting decision tree and long short-term memory network: a study of Chang Zhou fishery demonstration base, China. Comput. Elect. Agric. 175, 105530 (2020).

  29. Zhou, Z. H. Ensemble methods: foundations and algorithms (Chapman and Hall/CRC, 2019).

  30. Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. Preprint at https://arxiv.org/abs/1412.3555 (2014).

  31. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

  32. Medsker, L. R. & Jain, L. C. Recurrent neural networks. Des. Appl. 5, 64–67 (2001).

  33. Williams, R. J. & Zipser, D. A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1, 270–280 (1989).

  34. Che, Z., Purushotham, S., Cho, K., Sontag, D. & Liu, Y. Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 8, 1–12 (2018).

  35. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).

  36. Svetnik, V. et al. Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inform. Comput. Sci. 43, 1947–1958 (2003).

  37. Chang, C. C. & Lin, C. J. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2, 1–27 (2011).

  38. Yang, B. P., Zhou, W. & Li, L. Clinical analysis of the delivery date reckoned by traditional and modified formula for calculating the expected date of confinement. Med. J. Chin. Peoples Liberat. Army. 42, 230–233 (2017).

Acknowledgements

We would like to thank all the pregnant women and healthcare professionals who participated in the different stages of the development of the prediction model.

Funding

This research was funded by the Primary Research and Development Plan of Zhejiang Province in China (Grant No. 2020C03107) and the Natural Science Foundation of Zhejiang Province in China (Grant No. GF20F020009).

Author information

Author notes

  1. These authors contributed equally: Yina Wu and Yichao Zhang.

Authors and Affiliations

  1. Engineering Research Center of Mobile Health Management Ministry of Education, Hangzhou Normal University, Hangzhou, China

    Yina Wu, Yichao Zhang, Zhenming Yuan, Xiaoyan Sun & Yingfei Wu

  2. Hangzhou Hele Tech. Co, Hangzhou, China

    Xu Zou

  3. Hangzhou Women’s Hospital, Hangzhou, China

    Wensheng Hu & Sha Lu

Authors

  1. Yina Wu

  2. Yichao Zhang

  3. Xu Zou

  4. Zhenming Yuan

  5. Wensheng Hu

  6. Sha Lu

  7. Xiaoyan Sun

  8. Yingfei Wu

Contributions

Y.W. and Y.Z. contributed equally to this work. Y.W. and Y.W. were responsible for the study design. Y.Z., S.L. and W.H. extracted the data. Y.W. completed the relevant experiments. Y.W., X.Z., Z.Y., W.H., S.L. and X.S. provided feedback on the analyses and the interpretation of results. Y.Z., Y.W. and Y.W. wrote the paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yingfei Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wu, Y., Zhang, Y., Zou, X. et al. Estimated date of delivery with electronic medical records by a hybrid GBDT-GRU model. Sci Rep 12, 4892 (2022). https://doi.org/10.1038/s41598-022-08664-5

  • DOI: https://doi.org/10.1038/s41598-022-08664-5

FAQs

What is the EDD and actual delivery date?

The estimated due date (EDD, also called the estimated date of confinement, EDC) is the date on which spontaneous onset of labor is expected to occur. The due date may be estimated by adding 280 days (9 months and 7 days) to the first day of the last menstrual period (LMP). This is the method used by "pregnancy wheels".

What is the calculation and importance of EDD?

Naegele's rule is a standard way of calculating the due date for a pregnancy when assuming a gestational age of 280 days at childbirth. The rule estimates the expected date of delivery (EDD) by adding a year, subtracting three months, and adding seven days to the origin of gestational age.

What is EDD in medical terms?

The estimated delivery date, also called the estimated due date or EDD, is 40 weeks or 280 days from the first day of the last menstrual period (LMP). Only about 4 percent (fewer than 1 in 20) of pregnant people will give birth on their due date.

How do you calculate estimated delivery date?

An estimated due date can be calculated by following steps 1 through 3: First, determine the first day of your last menstrual period. Next, count back 3 calendar months from that date. Lastly, add 1 year and 7 days to that date.
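
As a minimal sketch of the two calculations described in these answers, the following Python compares the calendar rule (count back 3 months, add 1 year and 7 days) with the plain 280-day count. The function names, and the choice to clamp the day when the shifted month is shorter (for example, 31 May counted back to February), are assumptions made for illustration; they are not taken from the article or from any clinical standard.

```python
import calendar
from datetime import date, timedelta


def edd_by_days(lmp: date) -> date:
    """EDD as the first day of the LMP plus 280 days."""
    return lmp + timedelta(days=280)


def edd_by_naegele(lmp: date) -> date:
    """EDD by counting back 3 calendar months, then adding 1 year and 7 days."""
    # Zero-based month after counting back three months; a negative value
    # means the shifted month falls in the previous calendar year.
    month0 = lmp.month - 1 - 3
    # Floor division handles the wrap across the year boundary; +1 adds the year.
    year = lmp.year + month0 // 12 + 1
    month = month0 % 12 + 1
    # Assumption: if the target month is shorter, clamp to its last day.
    day = min(lmp.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(days=7)


if __name__ == "__main__":
    for lmp in (date(2021, 1, 1), date(2021, 7, 1)):
        print(lmp, edd_by_naegele(lmp), edd_by_days(lmp))
```

For an LMP of 1 January 2021 both functions give 8 October 2021. For an LMP of 1 July 2021 the calendar rule gives 8 April 2022 while the 280-day count gives 7 April 2022, because nine calendar months do not always span the same number of days.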

How accurate are EDD dates?

More than 90% of babies are born within two weeks either side of the predicted date. But, as noted above, only 4% (or 4.4%, ignoring pregnancies with complications etc.) are born on the predicted date itself; in other words, the chance of this happening is less than one in 20.

Why do we add 7 days in the expected date of delivery?

Naegele's rule, named after a German obstetrician, counts back 3 months from the first day of the LMP and then adds 1 year and 7 days to calculate the estimated due date (EDD). Nine calendar months plus 7 days comes to roughly 280 days (40 weeks). It is prudent for the obstetrician to take a detailed menstrual history, including duration, flow, previous menstrual periods, and use of hormonal contraceptives.

How do doctors determine EDD?

Traditionally, determining the first day of the LMP is the first step in establishing the EDD. By convention, the EDD is 280 days after the first day of the LMP.

What week is it safe to give birth?

Babies who are born after 34 weeks gestation have the same long-term health outcomes as babies who are delivered at full term (40 weeks). This means that if your baby is born when they are 34 weeks old, they have the same chances of being healthy as any other baby that wasn't born prematurely.

Who calculates the EDD or estimated date of delivery?

If you know you had your LMP during a particular week, your doctor can estimate your due date accordingly. If you have no idea when your last period was, your doctor may order an ultrasound to determine your due date.
