A new proportional hazard model with applications to breastfeeding data

Gómez, Yolanda M.; Caimanque, Wilson E.; Santibáñez, John L.; Zúñiga-Tapia, Rocío; Gallardo, Diego I.; Bourguignon, Marcelo

doi:10.1038/s41598-025-08219-4

Download PDF

Article
Open access
Published: 01 July 2025

A new proportional hazard model with applications to breastfeeding data

Yolanda M. Gómez¹,
Wilson E. Caimanque²,
John L. Santibáñez²,
Rocío Zúñiga-Tapia³,
Diego I. Gallardo¹ &
…
Marcelo Bourguignon⁴

Scientific Reports volume 15, Article number: 21869 (2025) Cite this article

141 Accesses
Metrics details

Subjects

Abstract

The World Health Organization recommends exclusive breastfeeding for the first 6 months of life, but several factors influence its cessation. Changes in breastfeeding behavior have been reported in the literature among infants at 0, 1, 3 and 5 months. The purpose of this paper is to introduce a new proportional hazards model based on the transformed-transformer method, employing the Weibull and piecewise exponential distributions as baseline models, with the aim of setting the partitions at the time when most weaning occurs. Some properties of the model, including monotonicity and quantile functions, are discussed in detail. Additionally, a regression model for censored data is presented. The parameters of the proposed distribution are estimated using the maximum likelihood approach. Some numerical illustrations of the estimators are presented with a discussion of the obtained results. Finally, two real applications related to breastfeeding in Brazil and the USA are provided to illustrate the usefulness of our proposal

Investigating the impact of breastfeeding difficulties on maternal mental health

Article Open access 19 April 2025

Breastfeeding is associated with reduction in postpartum depression in the United Arab Emirates: a retrospective cross-sectional study

Article Open access 27 March 2025

Breastfeeding competency and its influencing factors among pregnant women in third trimester pregnancy: a cross-sectional study

Article Open access 23 January 2023

Introduction

Breastfeeding is feeding a mother’s breast milk to her infant directly from the breast or by expressing (or pumping) the milk and bottle-feeding it to the infant. This method provides the ideal nutrition for infants and significantly contributes to their health, growth, and development. Additionally, breastfeeding benefits the mother’s health and fosters a strong emotional bond between mother and baby¹. The World Health Organization recommends exclusive breastfeeding for the first six months of life, followed by continued breastfeeding and appropriate complementary foods for up to two years or beyond. Breastfeeding data is essential for several reasons:

1.
Infant health: breastfeeding data helps understand infants’ health benefits. Breastfed babies have lower risks of various conditions, including asthma, obesity, type 1 diabetes, severe lower respiratory diseases, acute otitis media (ear infections), sudden infant death syndrome (SIDS), gastrointestinal infections (such as diarrhea and vomiting), necrotizing enterocolitis (NEC) in preterm infants, malnutrition, and nephrotic syndrome^1,2,3,4,5.
2.
Maternal health: breastfeeding can also provide health benefits for mothers, including a reduced risk of hypertension, type 2 diabetes, ovarian cancer, and breast cancer².
3.
Health care costs: Breastfeeding data can help understand and reduce health care costs. For example, if 90% of families breastfed exclusively for six months, the United States would save $2.2 billion per year on medical costs^6,7.
4.
Public health policy: breastfeeding data shapes public health policies. It helps develop interventions and programs that promote breastfeeding, especially in communities with low breastfeeding rates⁸.
5.
Research: breastfeeding data is also essential for research purposes. It can be used to study the effects of breastfeeding on various health outcomes and to identify factors that influence breastfeeding initiation and duration⁹.
6.
Education and support: breastfeeding data can help to identify areas where more education and support are needed to encourage breastfeeding¹⁰.

Several factors influence the cessation of exclusive breastfeeding, including difficulties with breastfeeding, concerns about infant weight gain, illnesses affecting mothers or infants, the need for medication, misperceptions about milk supply, younger maternal age, lower levels of maternal education, unplanned pregnancies, employment outside the home, and a lack of emotional support, among others^{1,5,11,12,13,14,15}.

Exclusive breastfeeding rates widely vary beyond six months. For instance, they are only 2% in Bulgaria, 3.7% in Poland, and 1% in Greece, Finland, and the United Kingdom. In contrast, some countries meet international recommendations, including Kyrgyzstan at 56.1%, Georgia at 54.8%, Croatia at 52.4%, Slovakia at 49%, and Hungary at 44%¹⁶.

However, the literature has documented that there are critical periods for breastfeeding. Changes in breastfeeding behavior among infants at 0, 1, 3.5, and 5 months. These periods must be incorporated into the analysis of the data in some way. Many babies stopped breastfeeding for various reasons. This aligns with findings by Gianni et al.¹¹, which indicated that 63% of Spanish mothers surveyed reported experiencing difficulties within the first month postpartum¹². Similarly, a study on Taiwanese women showed that only 29.3% maintained exclusive breastfeeding before the third month¹³. Furthermore, only 38.4% of Saudi Arabian women breastfed exclusively for six months or more, with many ceasing prior to that mark¹⁴. Furthermore, because lactation studies often cannot be considered over such a long period, it is common for this type of study to end before all mothers in the study have weaned, making some of the data subject to right-censoring. In summary, breastfeeding data is crucial for monitoring the health of infants and mothers, informing public health policy, guiding research, and supporting breastfeeding education and promotion efforts; in this article, we will focus on point 5.

Thus, the survival analysis is an ad hoc tool to analyze data related to the time of the occurrence until an event of interest, typically censored. In this line, many distributions have been used in the literature. A key distribution used in this context is the Weibull (W) model, both because of the simplicity of its survival and risk functions and because its risk function can be monotonic (increasing, decreasing, or constant) depending only on the shape parameter. The corresponding cumulative distribution (cdf) and probability density function (pdf) for $T\sim \text{ W }(\gamma ,\alpha )$ are defined by

$$\begin{aligned} g(t;\alpha ,\gamma ) = \frac{\alpha t^{\alpha -1}}{\gamma ^{\alpha }}\exp {\left\{ -\left( \frac{t}{\gamma }\right) ^\alpha \right\} }\quad \text{ and } \quad G(t) = 1-\exp {\left\{ -\left( \frac{t}{\gamma }\right) ^\alpha \right\} }, \end{aligned}$$

respectively and $t>0$, $\alpha ,\gamma >0$. Other models that appear recurrently in the literature (and especially in classical books, for example,¹⁷) associated with survival analysis are the log-normal and log-logistic models. However, the log-normal model has a non-monotone hazard function, while the logistic model can only assume decreasing monotone hazard functions. The problem is that all of these models do not allow for breaking points to be incorporated into the analysis, as discussed above for our weaning problem. An interesting alternative to these models, and considering this issue proposed more than 50 years ago, is the piecewise exponential (PE) model proposed by Feigl and Zelen¹⁸. Similar to the exponential model, this distribution has the characteristic of having a constant hazard function, not constant in $(0, \infty )$, but rather constant in each of L defined intervals. Specifically, its hazard function is given by

$$\begin{aligned} h(t; {\varvec{\lambda }}\mid {\varvec{a}})=\lambda _l, \quad \text {for} \ t \in [a_{l-1},a_l), \quad l=1,\ldots ,L, \end{aligned}$$

(1)

where $0=a_0<a_1<\cdots<a_{L-1} < a_L=\infty$. The corresponding cdf and pdf are given by

$$\begin{aligned} R(t; {\varvec{\lambda }} \mid {\varvec{a}})&=1-\exp \left( -\sum _{l=1}^L \lambda _l \nabla _l(t)\right) , \quad \text{ and } \end{aligned}$$

(2)

$$\begin{aligned} r(t; {\varvec{\lambda }} \mid {\varvec{a}})&=\kappa _l \lambda _l \exp (-\lambda _l (t-a_{l-1})), \quad \text {for} \ t \in [a_{l-1},a_l), \quad l=1,\ldots ,L, \end{aligned}$$

(3)

respectively, with

$$\begin{aligned} \kappa _l= \left\{ \begin{array}{ll} 1 & \text{, } \text{ if } l=1, \\ \exp \left( -\displaystyle \sum _{i=1}^{l-1} \lambda _i(a_i-a_{i-1})\right) & \text{, } \text{ if } l=2,\ldots ,L, \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} \nabla _l(t)= \left\{ \begin{array}{lll} 0 & \text {,~if } t<a_{l-1}, \\ t-a_{l-1} & \text {,~if } a_{l-1}\le t<a_l, ~~~~l=1, \ldots , L,\\ a_l-a_{l-1} & \text {,~if } t \ge a_l. \end{array} \right. \end{aligned}$$

We use the notation $T\sim PE({\varvec{\lambda }},{\varvec{a}})$ to denote the PE distribution with vector of parameters ${\varvec{\lambda }}=(\lambda _1,\ldots ,\lambda _L)^\top$ ($\top$ indicates the transpose) and time partition ${\varvec{a}}= (a_0,a_1,\ldots ,a_L)^\top$. The case with covariates was introduced in Friedman¹⁹. One of the limitations of the PE model is that its hazard function is constant in each interval. An extension of the PE model solving this problem, using the power model²⁰, was developed by Gómez et al.²¹ and extended subsequently for the interval-censored case in dos Santos and Schneider²². Along these lines, we seek to propose new models that also allow these breakpoints to be incorporated into the analysis and that are more flexible regarding the hazard function within each interval. This can be developed by creating a new family of distribution following the idea of Alzaatreh et. al.²³. Different models can be found in the literature using this family distribution^24,25. For our purpose, the Weibull distribution is going to be used, and as a baseline, the PE distribution, then a new distribution is going to be developed in the next sections.

The method proposed by Alzaatreh et al.²³, adapted to our context of positive random variables, can be summarized as following. Let T be a random variable with pdf and cdf r(t) and R(t), respectively, and X be a continuous random variable with pdf g(x) defined on $(0,\infty )$. Also, let W(R(t)) be a function of the cdf R(t) of any random variable X so that W(R(t)) satisfies the following conditions:

a)
$W(R(t)) \in (0,\infty )$, $\forall t>0$;
b)
W(R(t)) is differentiable and monotonically non-decreasing, $\forall t>0$;
c)
$W(R(t)) \rightarrow 0$ as $t \rightarrow 0$ and $W(R(t)) \rightarrow \infty$ as $t \rightarrow \infty$.

The cdf of a new family of distribution is defined as

$$\begin{aligned} F(t) = \int _{0}^{W(R(t))} g(x) dx = G(W(R(t))). \end{aligned}$$

(4)

The pdf related to Eq. (4) is

$$\begin{aligned} f(t) = \left[ \frac{d}{dt} W(R(t))\right] r(t) g[W(R(t))]. \end{aligned}$$

Motivated by the breastfeeding problem, we propose to consider the PE distribution for r(t) and R(t), henceforth denotes by $r(t;{\varvec{\lambda }}\mid {\varvec{a}})$ and $R(t;{\varvec{\lambda }}\mid {\varvec{a}})$, respectively, providing new classes of piecewise models.

The remainder of this paper is organized as follows. In “The new models”, the new model is discussed, and the properties of the new distribution are presented. Estimation using the maximum likelihood (ML) method, the inference used for this model is presented in “Inference”. A simulation study is presented in “Simulation study”, and real-data applications are reported in “Real data applications”. Finally, concluding remarks are given in “Conclusions”.

The new models

For the first time in the literature, we propose to investigate piecewise extensions for the T-X family of distributions considering $T \sim \text{ PE }({\varvec{\lambda }},{\varvec{a}})$ and $X \sim \text{ W }(1, \alpha )$ (Bourguignon et al²⁶), i.e., the Weibull distribution with scale and shape parameters 1 and $\alpha$, respectively. We will focus on this combination of distributions because they provide a piecewise version of the Weibull model, which is a model that has been subject of many studies (see, e.g., Alzaatreh and Ghosh,²⁴; Thair et al.²⁵). We will name this models as Weibull piecewise exponential (WPE) distributions. However, depending as the choice of $W(\cdot )$, we propose two cases which are detailed below.

Case 1: $W(R(t)) = \dfrac{R(t)}{1-R(t)}$

In this case, the cdf of the new model is

$$\begin{aligned} F(t; \varvec{\lambda }, \alpha \, |\, \varvec{a}) = 1- \exp \left\{ -\left( \frac{R(t; \varvec{\lambda }\, |\, \varvec{a})}{1- R(t; \varvec{\lambda }\, |\, \varvec{a})} \right) ^{\alpha } \right\} , \quad t>0, \end{aligned}$$

(5)

where $R(\cdot ; \varvec{\lambda }\, |\, \varvec{a})$ is the cdf for the PE model described in Eq. (2). We will say $T\sim \text{ WPE1 }(\varvec{\lambda }, \alpha )$ to refer to a random variable with cdf as in (5). The corresponding pdf is given by

$$\begin{aligned} f(t; \varvec{\lambda }, \alpha \, |\, \varvec{a}) = \frac{\alpha r(t; \varvec{\lambda }\, |\, \varvec{a}) \bigl [R(t; \varvec{\lambda }\, |\, \varvec{a}) \bigr ]^{\alpha -1}}{\bigl [1 -R(t; \varvec{\lambda }\, |\, \varvec{a}) \bigr ]^{\alpha +1 }} \, \exp \left\{ - \left( \frac{R(t; \varvec{\lambda }\, |\, \varvec{a})}{1 - R(t; \varvec{\lambda }\, |\, \varvec{a})} \right) ^{\alpha } \right\} , \quad t>0. \end{aligned}$$

The quantile function, Q(p), $0< p < 1$, is given by

$$\begin{aligned} Q(p) = R^{-1} \biggl (\frac{[-\log (1 - p)]^{{1}/{\alpha }}}{1+[-\log (1 - p)]^{{1}/{\alpha }}};\,\ \varvec{\lambda }\,|\,\varvec{a} \biggr ). \end{aligned}$$

The hazard function can be expressed as,

$$\begin{aligned} h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a}) = \frac{\alpha \, r(t; \varvec{\lambda }\, |\, \varvec{a}) \bigl [R(t; \varvec{\lambda }\, |\, \varvec{a})\bigr ]^{\alpha -1}}{ \bigl [1 -R(t; \varvec{\lambda }\, |\, \varvec{a})\bigr ]^{\alpha +1 }}, \quad t>0. \end{aligned}$$

(6)

The behavior of the hazard function for the WPE1 model is described in the next proposition.

Proposition 1

The hazard function for the WPE1 model is increasing, constant, or decreasing within each interval defined by the partition time for $\alpha >1$, $\alpha =1$, and $\alpha <1$, respectively, except at most in an interval, where it will be non-monotonic.

Proof

The derivative of the logarithm in Eq. (6) is given by

$$\begin{aligned} \frac{\partial \log h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a})}{\partial t}=\frac{r'(t; \varvec{\lambda }\, |\, \varvec{a})}{r(t; \varvec{\lambda }\, |\, \varvec{a})}+r(t; \varvec{\lambda }\, |\, \varvec{a})\left[ \frac{\alpha -1+2R(t; \varvec{\lambda }\, |\, \varvec{a})}{R(t; \varvec{\lambda }\, |\, \varvec{a})[1-R(t; \varvec{\lambda }\, |\, \varvec{a})]}\right] , \end{aligned}$$

where $r'(t; \varvec{\lambda }\, |\, \varvec{a})=\partial r(t; \varvec{\lambda }\, |\, \varvec{a})/\partial t=-\kappa _l \lambda _l^2 \exp (-\lambda _l(t-a_{l-1}))I^{(t)}_{[a_{l-1},a_l)}$ and $I_{A}^{(t)}$ denotes the indicator function. Therefore, $r'(t; \varvec{\lambda }\, |\, \varvec{a})/r(t; \varvec{\lambda }\, |\, \varvec{a})=-\lambda _l I^{(t)}_{[a_{l-1},a_l)}$, for $l=1,\dots ,L$. In addition, if $t\in [a_{l-1},a_l)$ we also can write $r(t; \varvec{\lambda }\, |\, \varvec{a})=\lambda _l C_{l}\exp (-\lambda _l t)$ and $R(t; \varvec{\lambda }\, |\, \varvec{a})=1-C_{l}\exp (-\lambda _l t)$, with $C_{l}=\kappa _l \exp (\lambda _l a_{l-1})$. With those notations and considering $t \in [a_{l-1},a_l)$, we obtain

$$\begin{aligned} \frac{\partial \log h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a})}{\partial t}=-\lambda _l+\frac{\lambda _l [\alpha -1+2(1-C_{l}e^{-\lambda _l t})]}{(1-C_{l}e^{-\lambda _l t})}. \end{aligned}$$

Therefore, the solution for $\partial \log h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a})/\partial t=0$ is given by $t_l=-\frac{1}{\lambda _l}\log \left( \frac{\alpha }{C_l}\right)$. Note that $t_l \in [a_{l-1},a_l)$, if $\kappa _l\exp (-\lambda _l(a_l-a_{l-1})) < \alpha \le \kappa _l$. As $\kappa _1<\kappa _2<\ldots <\kappa _L$, then: i) for a single $l=1,\ldots ,L$, it can be fulfilled that if $\kappa _l\exp (-\lambda _l(a_l-a_{l-1})) < \alpha \le \kappa _l$ or; ii) for no l is it true that if $\kappa _l\exp (-\lambda _l(a_l-a_{l-1})) < \alpha \le \kappa _l$. The first case implies that hazard function is monotonic within all intervals, except in one, whereas the second case implies that the hazard function is monotonic within all the intervals. $\square$

Figure 1 shows the hazard function for different time partitions. Note that when $\alpha <1$, the function is decreasing in both partitions, and if $\alpha > 1$, the function has different behavior because in the first time partition, the function is increasing, and in the second partition, it is a parabola. It is possible to note that in the first time partition, the function is decreasing and in the second time partition, the function is increasing. ${\varvec{\lambda }}$ value plays the role of scale parameter. The figures were made with the statistical software R²⁷.

Case 2: $W(R(t))=-\log \left( 1-R(t)\right)$

In this case, the cdf of the new model is

$$\begin{aligned} F(t; \varvec{\lambda }, \alpha \, |\, \varvec{a}) = 1 - \exp \biggl \{-\Bigl (- \log \bigl [1 - R(t; \varvec{\lambda }\, |\, \varvec{a}) \bigr ] \Bigr )^\alpha \biggr \}, \quad t > 0, \end{aligned}$$

(7)

and the pdf is

$$\begin{aligned} f(t; \varvec{\lambda }, \alpha \, |\, \varvec{a}) = \frac{\alpha r(t; \varvec{\lambda }\, |\, \varvec{a}) }{ 1- R(t; \varvec{\lambda }\, |\, \varvec{a})} \exp \biggl \{-\Bigl (- \log \bigl [1 - R(t; \varvec{ \lambda }\, |\, \varvec{a}) \bigr ] \Bigr )^\alpha \biggr \} \Bigl ( -\log \bigl [1 - R(t; \varvec{\lambda }\, |\, \varvec{a}) \bigr ] \Bigr )^{\alpha -1}. \end{aligned}$$

Henceforth, we will refer to this model as WPE2$(\varvec{\lambda , a}, \alpha )$. Then, the survival and hazard function are given by,

$$\begin{aligned} S(t; \varvec{\lambda }, \alpha \, |\, \varvec{a})= \exp \biggl \{-\Bigl (- \log \bigl [1 - R(t; \varvec{\lambda }\, |\, \varvec{a}) \bigr ] \Bigr )^\alpha \biggr \}, \end{aligned}$$

and

$$\begin{aligned} h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a}) = \frac{\alpha r(t; \varvec{\lambda }\, |\, \varvec{a})}{1 - R(t; \varvec{\lambda }\, |\, \varvec{a})} \biggl ( -\log \bigl [1 - R(t; \varvec{\lambda }\, |\, \varvec{a}) \bigr ] \biggr )^{\alpha -1}, \end{aligned}$$

where $R(\cdot )$ and $r(\cdot )$ are the pdf and cdf of the PE model defined in Eqs. (2) and (3), respectively. Figure 2 illustrates the flexibility of the WPE2 model. The parameter $\alpha$ has a similar form in $L=2$ and $L=3$. In the case of $\alpha < 1$, the hazard function is decreasing, and for $\alpha > 1$ the function is increasing. When $\alpha$ is too small, the hazard function decreases faster. The different pattern for the hazard function of the WPE2 model is presented in the following proposition.

Proposition 2

The hazard function for the WPE2 model is increasing, constant, or decreasing within each interval defined by the partition time for $\alpha >1$, $\alpha =1$, and $\alpha <1$, respectively.

Proof

As $r(t; \varvec{\lambda }\, |\, \varvec{a})/[1-R(t; \varvec{\lambda }\, |\, \varvec{a})]$ is constant between each partition time, it follows that

$$\begin{aligned} \frac{\partial \log h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a})}{\partial t}=-\frac{(\alpha -1)r(t; \varvec{\lambda }\, |\, \varvec{a})}{(1-R(t; \varvec{\lambda }\, |\, \varvec{a}))\log (1-R(t; \varvec{\lambda }\, |\, \varvec{a}))}. \end{aligned}$$

Note that $\partial \log h(t; \varvec{\lambda }, \alpha \, |\, \varvec{a})/\partial t=0$ has no solution for $t>0$. It follows directly the result. $\square$

Some particular cases of the WPE2 model are:

WPE2 reduce to PE model when $\alpha =1$
WPE2 reduce to Exponential model when $\alpha =1$ and $L=1$

The quantile function is

$$\begin{aligned} Q(p) = R^{-1} \Bigl ( 1 - \exp \Bigl \{ - \left[ -\log (1 - p)\right] ^{{1}/{\alpha }}\Bigr \} ;\,\varvec{\lambda }\,|\,\varvec{a}\Bigr ), \qquad 0<p<1 \end{aligned}$$

Proposition 3

If $T \sim \text {WPE2}(\varvec{\lambda , a}, \alpha )$, then the n-th moments is,

$$\begin{aligned} \begin{aligned} E[T^n]&= \sum _{\ell =1}^L \lambda _{\ell }^{-1} \sum _{k=0}^\infty \left( {\begin{array}{c}n\\ k\end{array}}\right) A^{n-k} \biggl \{ \Gamma \biggl [ \frac{k}{\alpha } + 1, \biggl (\sum _{d=1}^L \lambda _d (a_d - a_{d-1})+\lambda _{\ell }(a_\ell - a_{\ell - 1})\biggr )^{\alpha } \biggr ] \\&- \Gamma \biggl [ \frac{k}{\alpha } + 1, \biggl (\sum _{d=1}^L \lambda _d (a_d - a_{d-1}))\biggr )^{\alpha } \biggr \}. \end{aligned} \end{aligned}$$

Proof

By definition and using the alternative expression for the pdf we obtain

$$\begin{aligned} E[T^n]&= \int _{0}^{\infty } t^n r(t; \varvec{\lambda , a}, \alpha ) dt \\&= \alpha \sum _{\ell =1}^L \lambda _\ell \int _{a_{l-1}}^{a_l} t^n \biggl ( \sum ^{L}_{\ell =1} \lambda _\ell \nabla _{l}(t) \biggr )^{\alpha -1} \exp \left\{ -\biggl ( \sum ^{L}_{d=1} \lambda _d \nabla _{d}(t) \biggr )^{\alpha } \right\} dt. \end{aligned}$$

Using the definition of $\nabla _{\ell }$ in Eq. (1), putting $u = w^\alpha$, with $w = I_{\ell >1} \sum _{d=1}^L \lambda _d (a_d - a_{d-1}) + \lambda _{\ell } (t - a_{\ell - 1})$ and using the identity $(1+b)^c=\sum _{k=0}^{\infty }\left( {\begin{array}{c}c\\ k\end{array}}\right) b^k$, the result is obtained. $\square$

The non-homogeneous case

The Cox proportional hazards (PH) model approach²⁸ was considered to incorporate predictor variables. Cox assumed that the hazard ratio between any two observations is constant over time, according to the relationship

$$\begin{aligned} h(t\,|\,{\varvec{x}}_i) = h_0(t)\exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}, \quad t>0, \end{aligned}$$

where $h_0(\cdot )$ is called the baseline or reference risk, ${\varvec{x}}_i= (x_{i1}, \ldots , x_{ip})$ is a vector of covariates for the i-th individual and ${\varvec{\beta }}=(\beta _1,\ldots ,\beta _p)$ is the vector of coefficients related to the p observed covariates (both without intercept term). In this context, $\exp (\beta _j)$ represents the increase or decrease in the hazard function if all covariates were fixed and the j-th covariate was increased in one unit. Based on this scheme, the survival function becomes

$$\begin{aligned} S(t\,|\,{\varvec{x}}_i) = S_0(t)^{\exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}}, \quad t>0, \end{aligned}$$

with $S_0(\cdot )$ being the baseline survival function corresponding to $h_0(\cdot )$. Taking into account our initial problem of considering the critical months for modeling the breastfeeding problem, we propose to explore the use of the WPE1 and WPE2 distributions in this context.

Inference

In this section, we study the ML estimation for the parameters of the WPE1 and WPE2 for the right censored problem.

ML estimation

For this problem and considering that the data are subject to right censoring, the observed data for the i-th individual can be represented by $t_i=\min (t_i^*,c_i)$, $\delta _i=I(t_i^*\le c_i)$ and the vector of covariates $\textbf{x}_{i}$, where $t_i^*$ and $c_i$ denotes the failure and censoring times, respectively, and $I(\cdot )$ denotes the indicator function. Under a right censoring scheme, the corresponding log-likelihood function for ${\varvec{\theta }}=({\varvec{\beta }}^\top ,{\varvec{\lambda }}^\top , \alpha )$ (the vector of parameters) under the assumption of non-informative censoring is expressed as

$$\begin{aligned} \ell ({\varvec{\theta }})= \sum _{i=1}^n \left[ \delta _i \left\{ \log h_{0}(t_i\mid {\varvec{\lambda }},\alpha )+{\varvec{x}}_i^{\top }{\varvec{\beta }}\right\} +\exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}\log S_0(t_i\mid {\varvec{\lambda }},\alpha )\right] . \end{aligned}$$

(8)

The ML estimator of ${\varvec{\theta }}$ is obtained maximizing $\ell ({\varvec{\theta }})$ in relation to ${\varvec{\theta }}$. This is equivalent to solve the equations

$$\begin{aligned} \frac{\partial \ell ({\varvec{\theta }})}{\partial {\varvec{\beta }}}=0, \quad \frac{\partial \ell ({\varvec{\theta }})}{\partial {\varvec{\lambda }}}=0 \quad \text{ and } \quad \frac{\partial \ell ({\varvec{\theta }})}{\partial \alpha }=0. \end{aligned}$$

(9)

Note that

$$\begin{aligned} \frac{\partial \ell ({\varvec{\theta }})}{\partial {\varvec{\beta }}}=\sum _{i=1}^n {\varvec{x}}_i^{\top }\left[ \delta _i+\exp ({\varvec{x}}_i^{\top } {\varvec{\beta }})\log S_0(t_i\mid {\varvec{\lambda }}, \alpha )\right] . \end{aligned}$$

On the other hand, for the WPE1 model

$$\begin{aligned} \frac{\partial \ell ({\varvec{\theta }})}{\partial \lambda _\ell }&= \sum _{i=1}^n \bigg \{\delta _i \left[ \frac{1}{\lambda _\ell }- (t_i-a_{\ell -1})I(a_{\ell -1}\le t_i < a_\ell )+(\alpha -1)\frac{[1-R(t_i\mid {\varvec{\lambda }})]}{R(t_i\mid {\varvec{\lambda }})}\nabla _\ell (t_i)-(\alpha +1)\nabla _\ell (t_i)\right] \\&~~~~~~~~~~~~~-\alpha \exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}\left( \frac{R(t_i\mid {\varvec{\lambda }})}{1-R(t_i\mid {\varvec{\lambda }})}\right) ^{\alpha -1}\frac{\nabla _\ell (t_i)}{[1-R(t_i\mid {\varvec{\lambda }})]}\bigg \}, \quad \ell =1,\ldots ,L,\\ \frac{\partial \ell ({\varvec{\theta }})}{\partial \alpha }&=\sum _{i=1}^n \left\{ \delta _i\left[ \frac{1}{\alpha }+ \log \left( \frac{R(t_i\mid {\varvec{\lambda }})}{1- R(t_i\mid {\varvec{\lambda }})}\right) \right] -\exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}\left( \frac{R(t_i\mid {\varvec{\lambda }})}{1-R(t_i\mid {\varvec{\lambda }})}\right) ^{\alpha }\log \left( \frac{R(t_i\mid {\varvec{\lambda }})}{1-R(t_i\mid {\varvec{\lambda }})}\right) \right\} , \end{aligned}$$

whereas for the WPE2 model

$$\begin{aligned} \frac{\partial \ell ({\varvec{\theta }})}{\partial \lambda _\ell }&= \sum _{i=1}^n \bigg \{\delta _i \left[ \frac{1}{\lambda _\ell }- (t_i-a_{\ell -1})I(a_{\ell -1}\le t_i < a_\ell )+\nabla _\ell (t_i)+(\alpha -1)\frac{\nabla _\ell (t_i)}{\sum _{l=1}^L \nabla _l(t_i)}\right] \\&~~~~~~~~~~~~~-\alpha \exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}\left( \sum _{l=1}^L \lambda _l \nabla _l(t_i)\right) ^{\alpha -1}\nabla _\ell (t_i)\bigg \}, \quad \ell =1,\ldots ,L,\\ \frac{\partial \ell ({\varvec{\theta }})}{\partial \alpha }&=\sum _{i=1}^n \left\{ \delta _i\left[ \frac{1}{\alpha }+\log \sum _{l=1}^L \lambda _l \nabla _l(t_i)\right] -\exp {({\varvec{x}}_i^{\top }{\varvec{\beta }})}\left( \sum _{l=1}^L \lambda _l \nabla _l(t_i)\right) ^\alpha \log \left( \sum _{l=1}^L \lambda _l \nabla _l(t_i)\right) \right\} . \end{aligned}$$

Equation (9) should be solved numerically using some optimization algorithm, such as Newton-Raphson and quasi-Newton²⁹. The R software²⁷ was used for this purpose. Specifically, the optim function was used for solve this optimization problem. Under mild regularity conditions (Cox and Hinley³⁰) the asymptotic distribution of the ML estimator $\widehat{\varvec{\theta }}$ is

$$\begin{aligned} \sqrt{n}\left[ \mathscr {I}({\varvec{\theta }})\right] ^{-1}\left( \widehat{\varvec{\theta }}-{\varvec{\theta }}\right) {\mathop {\longrightarrow }\limits ^{\mathscr {D}}} N_{p+L+1}({\varvec{0}}_{p+L+1}, {\varvec{I}}_{p+L+1}), \quad \text{ as } n\rightarrow \infty , \end{aligned}$$

(10)

where $N_{r}({\varvec{\mu }},{\varvec{\Sigma }})$ denotes the r-variate normal distribution with mean ${\varvec{\mu }}$ and variance ${\varvec{\Sigma }}$, ${\varvec{0}}_r$ denotes a vector of zeros with dimension r, ${\varvec{I}}_r$ denotes the identity matrix with dimension r and $\mathscr {I}({\varvec{\theta }})$ denotes the Fisher information matrix, which can be consistently estimated by $\widehat{\mathscr {I}}(\widehat{\varvec{\theta }})$, the hessian matrix related to the log-likelihood function in Eq. (8) evaluated at $\widehat{\varvec{\theta }}$. In addition, approximated confidence intervals for the components of ${\varvec{\theta }}$ can be considered as

$$\begin{aligned} \left[ \widehat{\mathscr {I}}(\widehat{\varvec{\theta }})\right] ^{-1}=\left( \begin{array}{ccc} \widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\varvec{\beta }}} & \widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\varvec{\lambda }}} & \widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\alpha }} \\ \cdot & \widehat{\mathscr {I}}^{\widehat{\varvec{\lambda }}\widehat{\varvec{\lambda }}} & \widehat{\mathscr {I}}^{\widehat{\varvec{\lambda }}\widehat{ \alpha }} \\ \cdot & \cdot & \widehat{\mathscr {I}}^{\widehat{\alpha }\widehat{\alpha }} \\ \end{array} \right) . \end{aligned}$$

Then, the approximated $100(1-\nu )\%$ confidence intervals for $\beta _r$, with $\nu \in (0,1)$ and $r=1,\ldots ,p$ is given by

$$\begin{aligned} \left( \widehat{\beta }_r - z_{1-\nu /2}\sqrt{\widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\varvec{\beta }}}_{rr}},\widehat{\beta }_r + z_{1-\nu /2}\sqrt{\widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\varvec{\beta }}}_{rr}}\right) , \end{aligned}$$

where $z_u$ denotes the u-th quantile of the standard normal distribution, $\widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\varvec{\beta }}}_{rr}$ denotes the r-th element of the diagonal of $\widehat{\mathscr {I}}^{\widehat{\varvec{\beta }}\widehat{\varvec{\beta }}}$. In a similar way, the approximated $100(1-\nu )\%$ confidence intervals for $\lambda _\ell$, $\ell =1,\ldots ,L$ and $\alpha$ are

$$\begin{aligned} \left( \widehat{\lambda }_\ell - z_{1-\nu /2}\sqrt{\widehat{\mathscr {I}}^{\widehat{\varvec{\lambda }}\widehat{\varvec{\lambda }}}_{\ell \ell }},\widehat{\lambda }_\ell + z_{1-\nu /2}\sqrt{\widehat{\mathscr {I}}^{\widehat{\varvec{\lambda }}\widehat{\varvec{\lambda }}}_{\ell \ell }}\right) \quad \text{ and } \quad \left( \widehat{\alpha } - z_{1-\nu /2}\sqrt{\widehat{\mathscr {I}}^{\widehat{\alpha }\widehat{ \alpha }}},\widehat{\alpha } + z_{1-\nu /2}\sqrt{\widehat{\mathscr {I}}^{\widehat{\alpha }\widehat{ \alpha }}}\right) , \end{aligned}$$

respectively.

Simulation study

In this section, we present a simulation study in order to assess the performance of the ML estimators for the WPE2 model under a scenario similar to application 1, which was carried out with the statistical software R²⁷. The algorithm used to generate censored samples for the $\text {WPE2}$ regression model (with $100\%C$ censoring of observations) is the following:

For $i = 1,\ldots , n$, draw the covariates vector $\textbf{x}_i$, where $\textbf{x}_i=(x_{1i},x_{2i},x_{3i},x_{4i})$ is a random sample from $X_1\sim Bern(63/150)$, $X_2\sim Bern(70/150)$, $X_3\sim Bern(66/150)$ and $X_4\sim Bern(56/150)$.
Draw $\displaystyle {U_i \sim U(0,1) }$.
Compute $W_i^{(T)} = \displaystyle {1 - \exp \left\{ - \left( \frac{- \log (U_i)}{\textbf{x}_i^{\top } {\varvec{\beta }}} \right) ^{1/\alpha } \right\} }$ and $W_i^{(C)} = \displaystyle {1 - \exp \left\{ - \left( \frac{- \log (C)}{ \textbf{x}_i^{\top } {\varvec{\beta }}} \right) ^{1/\alpha } \right\} }$.
Compute $Y_i = R^{-1}\left( W_i^{(T)}; {\varvec{\lambda }} \mid \varvec{a} \right) .$ (the failure times)
Compute $C_i=R^{-1}\left( W_i^{(C)}; {\varvec{\lambda }}\mid \varvec{a} \right) .$ (the censoring times)
Compute $T_i=\min (Y_i,C_i)$ and $\delta _i=I(Y_i \le C_i)$, the observed times and failure indicators.

Similar to the scheme of the first application, we assumed $L=4$ partitions and the partition time as ${\varvec{a}}= (0,0.08,0.29,0.41)$. The parameters used were $\varvec{\beta } = (\beta _1, \beta _2,\beta _3,\beta _4) =(0.58,0.76,0.68,0.75)$, $\varvec{\lambda } =(\lambda _1, \lambda _2,\lambda _3,\lambda _4) =(0.011,0.056,0.189,0.188)$ and $\alpha =0.58$. Different sample size were considered: 100, 150, 200 and 500, and three censoring cases: $10\%$, $25\%$ and $50\%$. For each combination of $\varvec{\beta }$, $\varvec{\lambda }$, $\alpha$, censoring and sample size, we simulated 1000 replicates of the WPE2 regression model, and the parameter estimation was performed based on the ML method. In each case, we compute the bias, the root of the mean square error (RMSE), the mean of the estimated standard errors (SE), and the 95% coverage probability (CP). The results are summarized in Table 1. It is possible to observe a reduction in the bias for all the parameters and under any censoring scenario when the sample size is increasing. In each case, we compute the mean and the standard error. In addition, the RMSE and SE terms are closer, specially when the sample size is increased, suggesting that the variance of the ML estimators is reasonably well estimated. Finally, the CP terms are closer to the nominal value used to their construction, suggesting that the asymptotic distribution for the ML estimators in Eq. (10) is appropriated even in finite samples.

Table 1 Empirical bias, SE, RMSE and CP for ML estimators in the WPE2 regression model for different scenarios.

Full size table

Real data applications

Several international health organizations established that breast milk is one of the most important ways to ensure the survival and health of children. It is recommended, for infants, exclusive breastfeeding for up to 6 months, starting at the first hour of life, which is recommended by OMS. In this Section, we analyzed Breastfeeding data from Brazil³¹ and the United States¹⁷. The codes are available in a public GitHub repository (https://github.com/Ymgomez89/Proportional-Hazard-Model).

Breastfeeding from Brazil

Colosimo and Giolo³¹ provide a data set related to 150 mothers, including the time from birth to cessation of breastfeeding (in months), as well as four covariates associated with each. Censored observations occur when the breastfeeding record is cut off for different reasons. The 4 reported covariates are related to previous breastfeeding experience ($x_1=1$ for yes ($42\%$) and $x_1=0$ for no ($58\%$)), ideal breastfeeding time ($x_2=1$ for greater than 6 months ($47\%$) and $x_2=0$ for otherwise ($53\%$)), breastfeeding problems ($x_3=1$ for yes ($44\%$) and $x_3=0$ for no ($56\%$)) and exclusive breastfeeding with breast milk ($x_4=1$ for yes ($37\%$) and $x_4=0$ for no ($63\%$)). Figure 3 shows the Kaplan–Meier (K–M) estimator³² by the four covariates.

Based on the discussion provided in the introduction Section, we fix the partition time in 0, 1, 3, and 5 months (i.e., $L=4$). Table 2 shows the log-likelihood, Akaike information criterion (AIC; Akaike³³), and the Bayesian information criterion (BIC; Schwarz³⁴) values obtained for the fitted piecewise-type models (PE, PPE, WPE1, WPE2) with $L=4$. Based on those criteria, the WPE2 model is preferred.

Table 2 Log-likelihood, AIC and BIC for the PE, PPE, WPE1 and WPE2 models in brazilian breastfeeding data.

Full size table

The estimates for the WPE2 model are presented in Table 3. Based on the results, we can obtain the following conclusions (keeping in mind that the other covariates are kept fixed).

The risk of early weaning for mothers without previous experience in breastfeeding is $\exp (\widehat{\beta _1})\approx 1.6$ times higher in relation to who did have experience (95% confidence interval 1–2.8).
The risk of early weaning for mothers who think that the ideal breastfeeding time is less than or equal to 6 months is approximately $\exp (\widehat{\beta _2})\approx 1.7$ times the risk of mothers who believe the opposite. (95% confidence interval: 1 - 2.8).
The risk of early weaning in mothers who had breastfeeding problems is $\exp (\widehat{\beta _3})\approx 2$ times higher compared to those who did not experience problems. (95% confidence interval 1.2–3.3).
The risk of early weaning in infants who were not exclusively breastfed is $\exp (\widehat{\beta _4})\approx 1.9$ times greater than the risk of premature weaning in infants who are exclusively breastfed. (95% confidence interval 1.1–3.1).

Table 3 ML estimates and standard errors (in parenthesis) for the WPE2 model in Brazilian breastfeeding data.

Full size table

Breastfeeding from United States

We consider a second data set related to breastfeeding obtained from Klein¹⁷. These data were taken from a National Survey of Youth Labor (NLSY) in Ohio, which began in 1979 with youths aged 14–21, interviewed annually until 1988. Beginning in 1983, women were asked about any pregnancy that had occurred since the last interview (pregnancies prior to 1983 were also documented), recording breastfeeding information. The data set includes information on 927 firstborns whose mothers chose to breastfeed. The sample is restricted to children who were born after 1978 and had a gestation of 20–45 weeks in order to eliminate recall problems.

The main variable is the duration of breastfeeding (in weeks) and whether the baby was weaned (an indicator of failure). Explanatory variables include race, poverty status, maternal smoking and drinking habits, maternal age at birth, maternal education, and prenatal care. The only significant variables are race (categorical: white $(71\%)$, black $(13\%)$ and other $(16\%)$), poverty status ($x_2$: 756 non-poor (82%), 171 poor (18%)), and smoking habits ($x_3$; 657 non-smoking mothers (71%), 270 mothers who smoked after birth (29%)). Figure 4 shows the K–M estimator by the three covariates.

Table 4 shows that the AIC and BIC criteria for the same piecewise type models fitted in the last application. Again, based on those criteria the WPE2 model provides the best fit among the considered models.

Table 4 Log-likelihood, AIC and BIC for the PE, PPE, WPE1 and WPE2 models for second dataset.

Full size table

In addition, Table 5 presents the estimates for those models. Based on the results, we can obtain the following conclusions (keeping in mind that the other covariates are kept fixed).

The risk of early weaning in black race is $\exp (\widehat{\beta _{11}})=1.2$ higher than white race (95% confidence interval 1.0 - 1.5), and in another race it is $\exp (\widehat{\beta _{12}})=1.5$ times higher with respect to the white race (95% confidence interval 1.2 - 1.8).
The risk of early weaning of mothers in poverty status is $\exp (\widehat{\beta _2})=0.9$ times in relation to mothers without poverty status (95% confidence interval 0.7 - 1.0).
The risk of early weaning who smoked after the baby was born is $\exp (\widehat{\beta _3})=1.4$ times higher with respect to mothers who did not smoke after birth (95% confidence interval 1.2 - 1.6).

Table 5 ML estimates and standard errors (in parenthesis) for the WPE2 model in american breastfeeding data.

Full size table

Conclusions

The introduction of the Weibull piecewise distribution represents a significant advancement in flexible statistical modeling. By allowing for different shapes of the hazard function across multiple time intervals, this model accommodates the complexities often observed in real-world data where risks may change over time, such as breastfeeding data. The dual approach to time partitioning–either estimating it directly from the data or predefining intervals–provides versatility for its application. This flexibility enables researchers and practitioners to tailor the model to their specific data characteristics, enhancing its relevance and usability.

Simulation studies have demonstrated the effectiveness of maximum likelihood estimation for deriving parameter values across various sample sizes. Such empirical validation is crucial, as it affirms the robustness of the model when applied to different contexts. This aspect also highlights the model’s practicality, offering a reliable method for analysts facing diverse data challenges. Finally, the proposed distribution is applied to a real data set to show the potential of this new distribution.

In conclusion, we define a flexible approach to modeling lifetime datasets across multiple time intervals by compounding the classes of Weibull (which is generally adequate for modeling monotone hazard rates) and PE (which allows breaking points to be incorporated into the analysis) distributions. The compounding procedure follows by taking the Weibull family of distributions as the baseline distribution in (2). In addition, we motivate the use of the new distribution in breastfeeding data. We hope that these two facts combined may attract more complex applications in survival analysis. Further exploration of the model could involve applying it to different datasets to assess its performance and robustness compared to traditional distributions. Additionally, investigating potential applications in fields such as reliability engineering, survival analysis, and risk management could provide valuable insights into its utility and versatility.

Data availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

References

Zhang, C., Zhao, W., Pan, X. & Huang, A. Effects of feeding patterns during the first 6 months on weight development of infants ages 0–12 months: a longitudinal study. Sci. Rep. 14, 17451 (2024).
Article CAS PubMed PubMed Central Google Scholar
Meek, J. Y., Noble, L. & on Breastfeeding, S. Policy statement: Breastfeeding and the use of human milk. Pediatrics150, e2022057988. https://doi.org/10.1542/peds.2022-057988 (2022).
Park, P. G., Heo, J. S., Ahn, Y. H. & Kang, H. G. Association between exclusive breastfeeding and the incidence of childhood nephrotic syndrome. J. Pediatr. 276, 114266–6 (2024).
Article PubMed Google Scholar
Kawan, M., Gregory, E. F. & Spatz, D. L. Improving breastfeeding care & support in a large, urban, pediatric primary care practice. J. Pediatr. Nurs. 70, e3-8 (2023).
Article PubMed Google Scholar
Shitie, A. et al. Breast problems and associated factors among lactating women in northeast ethiopia, 2022. Sci. Rep. 14, 9202 (2024).
Article ADS CAS PubMed PubMed Central Google Scholar
Oliveira, V., Prell, M. & Cheng, X. The economic impacts of breastfeeding: A focus on usda’s special supplemental nutrition program for women, infants, and children (wic). Economic Research Report ERR–261, U.S. Department of Agriculture, Economic Research Service (2019).
Bartick, M. C. et al. Suboptimal breastfeeding in the united states: Maternal and pediatric health outcomes and costs. Maternal Child Nutr. 13, e12366. https://doi.org/10.1111/mcn.12366 (2017).
Article Google Scholar
Organización Mundial de la Salud & UNICEF. Declaración conjunta del director ejecutivo del unicef y director general de la oms con ocasión de la semana mundial de la lactancia materna (2023).
Carter, S. K., Stone, A. & Cox, J. M. About research: Intersectionality in breastfeeding research. J. Hum. Lact. 39, 202–205, https://doi.org/10.1177/08903344231156446 (2023). Epub 2023 Mar 1.
Walsh, A. et al. Improving breastfeeding support through the implementation of the baby-friendly hospital and community initiatives: A scoping review. Int. Breastfeed. J. 18, 22. https://doi.org/10.1186/s13006-023-00556-2 (2023).
Article ADS PubMed PubMed Central Google Scholar
Gianni, M. L. et al. Breastfeeding difficulties and risk for early breastfeeding cessation. Nutrients 11, 2266. https://doi.org/10.3390/nu11102266 (2019).
Article PubMed PubMed Central Google Scholar
Lechosa-Muñiz, C., Paz-Zulueta, M., Cayón-De las Cuevas, J., Llorca, J. & Cabero-Pérez, M. J. Declared reasons for cessation of breastfeeding during the first year of life: An analysis based on a cohort study in northern spain. Int. J. Environ. Res. Public Health 18, 8414. https://doi.org/10.3390/ijerph18168414 (2021).
Article PubMed PubMed Central Google Scholar
Chang, P. C., Li, S. F., Yang, H. Y. et al. Factores asociados con el cese de la lactancia materna exclusiva a la posparto de 1 y 2 meses en taiwán. Int. Breastfeed. J. 14. https://doi.org/10.1186/s13006-019-0213-1 (2019).
Abusabah, A. M. et al. Prevalence and determinants of exclusive breastfeeding among mothers of children aged 6–24 months in the aseer region. Cureus 16, e66788. https://doi.org/10.7759/cureus.66788 (2024).
Article PubMed PubMed Central Google Scholar
Zitkute, V. et al. Reasons for breastfeeding cessation in the first year after childbirth in lithuania: A prospective cohort study. Medicina 56, 226. https://doi.org/10.3390/medicina56050226 (2020).
Article PubMed PubMed Central Google Scholar
Bagci Bosi, A. T., Eriksen, K. G., Sobko, T., Wijnhoven, T. M. & Breda, J. Prácticas y políticas de lactancia materna en los estados miembros de la región europea de la oms. Salud Pública y Nutrición 19, 753–764 (2016).
Google Scholar
Klein, J. P. & Moeschberger, M. L. Survival Analysis Techniques for Censored and Truncated Data (Springer, 1997).
Google Scholar
Feigl, P. & Zelen, M. Estimation of exponential survival probabilities with concomitant information. Biometrics 21 (1965).
Friedman, M. Piecewise exponential models for survival data with covariates. Ann. Stat. 10 (1982).
Durrans, S. R. Distributions of fractional order statistics in hydrology. Water Resour. Res. 28, 1649–1655 (1992).
Article ADS Google Scholar
Gómez, Y. M., Gallardo, D. I. & Arnold, B. C. The power piecewise exponential model. J. Stat. Comput. Simul. 10 (2017).
dos Santos Junior, P. C. & Schneider, S. Power piecewise exponential model for interval-censored data. J. Stat. Theory Pract. 16 (2022).
Alzaatreh, A., Lee, C. & Famoye, F. A new method for generating families of continuous distributions. METRON 71, 63 (2013).
Article MathSciNet Google Scholar
Alzaatreh, A. & Gosh, I. On the weibull-x family distributions. J. Stat. Theory Appl. 10 (2015).
Tahir, M. H., Cordeiro, G. M., Alzaatreh, A., Mansoor, M. & Zubair, M. The logistic-x family of distributions and its applications. Commun. Stat. Theory Methods 45 (2016).
Bourguignon, M., Silva, R. B. & Gauss, C. The weibull-g family of probability distributions. J. Data Sci. 12, 53–68 (2014).
Article MathSciNet Google Scholar
Team, R. C. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2025).
Cox, D. R. Regression models and life-tables. J. Roy. Stat. Soc.: Ser. B (Methodol.) 34, 187–202 (1972).
Article MathSciNet Google Scholar
Soetaert, K. & Herman, P. A Practical Guide to Ecological Modelling. Using R as a Simulation Platform (Springer, 2009).
Book Google Scholar
Cox, D. R. & Hinkley, D. V. Theoretical Statistics (CRC Press, Boca Raton, 1979).
Book Google Scholar
Colosimo, E. A. & Giolo, S. R. Análise de sobrevivência aplicada (Editora Blucher, 2021).
Google Scholar
Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958).
Article MathSciNet Google Scholar
Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974).
Article ADS MathSciNet Google Scholar
Schwarz, G. Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978).
Article MathSciNet Google Scholar

Download references

Acknowledgements

The work of the first author is partially supported by grant Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) 11230397.

Author information

Authors and Affiliations

Departamento de Estadística, Facultad de Ciencias, Universidad del Bío-Bío, Concepción, 4081112, Chile
Yolanda M. Gómez & Diego I. Gallardo
Departamento de Matemática, Facultad de Ingeniería, Universidad de Atacama, Copiapó, Chile
Wilson E. Caimanque & John L. Santibáñez
Escuela de Enfermería, Facultad de Ciencias para el Cuidado de la Salud, Universidad San Sebastián, Concepción, Chile
Rocío Zúñiga-Tapia
Departamento de Estatística, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
Marcelo Bourguignon

Authors

Yolanda M. Gómez
View author publications
Search author on:PubMed Google Scholar
Wilson E. Caimanque
View author publications
Search author on:PubMed Google Scholar
John L. Santibáñez
View author publications
Search author on:PubMed Google Scholar
Rocío Zúñiga-Tapia
View author publications
Search author on:PubMed Google Scholar
Diego I. Gallardo
View author publications
Search author on:PubMed Google Scholar
Marcelo Bourguignon
View author publications
Search author on:PubMed Google Scholar

Contributions

Y.M.G. and D.I.G. conceived the idea presented. W.E.C. and J.L.S. developed the theory and performed the computations. Y.M., D.G. and M.B. verified the analytical methods. J.L.S. and W.E.C. performed the numerical simulations. Y.M.G. and R.Z. analysed the data. Y.M., R.Z., D.I.G. and J.S. prepared the draft manuscript. All authors reviewed the results and approved the final version of the manuscript. All authors approved the final manuscript as submitted and agreed to be held accountable for all aspects of the work.

Corresponding author

Correspondence to Yolanda M. Gómez.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Gómez, Y.M., Caimanque, W.E., Santibáñez, J.L. et al. A new proportional hazard model with applications to breastfeeding data. Sci Rep 15, 21869 (2025). https://doi.org/10.1038/s41598-025-08219-4

Download citation

Received: 14 January 2025
Accepted: 19 June 2025
Published: 01 July 2025
DOI: https://doi.org/10.1038/s41598-025-08219-4

Subjects

Abstract

Similar content being viewed by others

Investigating the impact of breastfeeding difficulties on maternal mental health

Breastfeeding is associated with reduction in postpartum depression in the United Arab Emirates: a retrospective cross-sectional study

Breastfeeding competency and its influencing factors among pregnant women in third trimester pregnancy: a cross-sectional study

Introduction

The new models

Case 1: \(W(R(t)) = \dfrac{R(t)}{1-R(t)}\)

Proposition 1

Proof

Case 2: \(W(R(t))=-\log \left( 1-R(t)\right)\)

Proposition 2

Proof

Proposition 3

Proof

The non-homogeneous case

Inference

ML estimation

Simulation study

Real data applications

Breastfeeding from Brazil

Breastfeeding from United States

Conclusions

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Quick links