Introduction

Global petroleum exploration is increasingly focused on deep (> 4500 m) to ultradeep (> 6000 m) reservoirs1,2. Among these, carbonate reservoirs account for approximately 40% of the total proven oil and gas reserves and 60% of hydrocarbon production worldwide3,4,5,6. Unlike the Mesozoic-Cenozoic carbonate reservoirs, deeply buried carbonate rocks have lost most of their primary pores, resulting in tight reservoirs with highly uneven distribution and significant heterogeneity7,8,9,10. This leads to large variations in oil and gas production and complex production characteristics, posing challenges to the exploration and development of deeply buried hydrocarbon reservoirs. Fractures and associated dissolution can enhance matrix porosity by over 50% and increase permeability by 1–3 orders of magnitude11,12,13,14. Therefore, karstified and fractured carbonate reservoirs have attracted considerable attention. In recent years, the largest ultradeep fault-controlled carbonate oilfield has been found in the Halahatang oilfield of the Tarim Basin15 (Fig. 1). It has an estimated 1 billion tons of oil and gas reserves along strike-slip fault zones, revealing the controlling role of strike-slip faults in reservoir formation, accumulation, and oil and gas enrichment6,15,16,17. Understanding the distribution and geometry of strike-slip faults is therefore of great significance for the exploration and development of deeply buried carbonate reservoirs. Previous studies have shown that the strike-slip fault system within the Ordovician carbonate rocks in the northern Tarim Basin has complex and diverse geometric features, including horsetail structures, overlaps, en echelon faults, and conjugate strike-slip faults18,19,20,21. Han et al.19 mapped the strike-slip faults in the Tazhong area using coherence and curvature attributes. Wu et al.20 applied the maximum likelihood attribute to detect the strike-slip faults in the Tarim Basin, revealing the linkage and growth evolution of fault segments. Sun et al.22 identified the fault damage zones using gradient structure tensor attributes. Some researchers also suggested that different seismic attributes emphasize different aspects of faults23,24,25. To leverage the strengths of different attributes, Tian et al.24 proposed fusing multiple seismic attributes to identify strike-slip faults. However, due to intense karstification and the complex assemblage of fractures, pores, and caves in carbonate reservoirs, conventional seismic attribute methods (such as coherence and curvature) struggle to characterize the strike-slip fault system in this area22,25,26, which poses a challenge for reservoir description and target optimization. Therefore, it is crucial to enhance the identification and interpretation of strike-slip faults.

Fig. 1

(a) Tectonic framework of the Tarim Basin (modified from Ref. 52); (b) Regional geological map demonstrating the circum-Aman strike-slip fault system at the top surface of the Ordovician carbonates of the north Tarim Basin (the fault system was modified from Ref. 17, and the surface image was generated with the software Geoscope, http://www.chinarockstar.com/page3).

With the rapid development of artificial intelligence, machine learning techniques, especially deep learning (DL), have been increasingly applied to seismic problems27,28, including noise attenuation29,30, seismic inversion31,32, and the interpretation of geologic horizons33,34, salt bodies35,36, channels37, and faults38,39,40,41,42,43,44,45 in seismic data. DL techniques have shown superior performance compared to conventional seismic methods, particularly in fault interpretation. However, to our knowledge, they have not yet been applied to enhance the interpretation of strike-slip faults in the Tarim Basin. The success of DL is usually supported by well-designed neural network architectures, which allow models to extract meaningful features from raw data46. The main DL architectures include deep neural networks (DNN), convolutional neural networks (CNN), recurrent neural networks (RNN), generative adversarial networks (GAN), long short-term memory networks (LSTM), and transformers28,47. Among them, the CNN extracts spatial features from images through convolutional and pooling layers, making it particularly effective for classification and segmentation tasks. Unet, a variant of CNN, is a classic U-shaped segmentation network, featuring an encoder for feature extraction and a decoder with transpose convolutions for upsampling48. The majority (86.1%) of network structures used in fault interpretation are Unet28,47. Wu et al.38 first introduced the Unet network for automatic fault detection. Liu et al.42 added a residual block to the Unet architecture to highlight seismic faults. An et al.43 compared the performance of various network structures for fault detection. Lin et al.49 introduced a channel attention module into the Unet architecture, which reduces computational cost and improves fault detection efficiency.

Despite the above advantages, DL technology has some drawbacks. As a supervised learning method, it requires a large number of labeled fault samples for training, and the Unet model is often pre-trained using synthetic data28,38. Synthetic data is favored because it allows precise control over the consistency between seismic data and labels. However, a model trained only on synthetic data may perform poorly when directly applied to actual seismic data, as synthetic data cannot fully characterize all types of faults in actual seismic data28,47,50. At the same time, it is difficult to obtain a large number of labeled fault samples from actual seismic data. Some researchers suggested using transfer learning to address the generalization problem50, but obtaining appropriate fault labels remains a challenge because bias in manual fault interpretation is inevitable, and selecting appropriate fault labels is crucial for transfer learning28. Previous studies primarily focused on optimizing the network structure, and how to generate appropriate fault labels for actual seismic data has not been specifically addressed.

Inspired by the above techniques and considering the complex geological background of the northern Tarim Basin, we propose a method for constructing fault labels in actual seismic data and introduce a deep learning workflow to detect the strike-slip faults in the Ordovician carbonates of the Halahatang oilfield in the northern Tarim Basin. This method has two advantages. Firstly, constructing fault labels from combined seismic and well data largely avoids manual interpretation bias and yields more objective results. Secondly, transfer learning addresses the shortage of actual fault labels, requiring only a small number of annotated samples as the training dataset. This preserves the superior performance of the base DL model while taking into account the geological conditions of the field data.

Geological setting

The Tarim Basin, located in northwest China, is the largest superimposed basin in China and covers an area of approximately 56 × 10⁴ km². It is bounded by the Tianshan Mountain to the north, the Kunlun Mountain to the south, and the Arjin Mountain to the southeast51,52. According to the tectonic framework, the basin can be divided into 12 principal structural units (Fig. 1a). The studied Halahatang field belongs to the North uplift, covers an area of 4,500 km² (Fig. 1b), and contains 16,000-m-thick Cryogenian-Quaternary sedimentary strata8,17.

The stratigraphy of the north Tarim Basin comprises Cambrian to Middle Ordovician marine carbonates (Fig. 2), which are overlain by a thick layer of Upper Ordovician argillaceous limestone and mudstone, and by Silurian clastics8,9,52. The Ordovician system was primarily deposited on a carbonate platform and has been divided into the Lower Ordovician Penglaiba and Yingshan formations and the Middle Ordovician Upper Yingshan and Yijianfang formations52,53. In the Upper Ordovician, the Sangtamu, Lianglitag, and Tumuxiuke formations are partially absent, gradually pinching out northward and unconformably overlain by the Silurian system17,53,54. The Middle to Upper Ordovician carbonates, with a burial depth of 6000–7500 m, constitute the main oil-bearing strata9. The reservoir rocks primarily consist of light sandy limestone, micritic limestone, and oolitic limestone55,56. Most primary pores in the carbonate reservoirs were lost during burial, so secondary pores, fractures, and caves serve as the main storage spaces12,20,55 and exhibit high heterogeneity. Three sets of pre-Mesozoic source rocks were developed in the Halahatang field: the Precambrian, Lower Cambrian, and Middle-Upper Ordovician53,54,57 (Fig. 2). Three major stages of oil and gas accumulation have been proposed: oil emplacement during the late Caledonian (Ordovician-Silurian) and late Hercynian (Permian), and gas accumulation during the late Himalayan (Neogene)54,57,58.

Fig. 2

Stratigraphic column of the north Tarim Basin (modified from Ref. 52). Seismic reflecting surfaces (horizons) and the timing of tectonic movements are shown.

The Tarim Basin, developed on the Precambrian crystalline and folded basement59, has experienced multiple tectonic movements and is intersected by multiple faults. Under a weak extensional background, a large-scale Cambrian-Ordovician carbonate platform formed in a cratonic basin55,60. During the Middle Ordovician, intense compression along the southern margin of the Tarim Basin, influenced by the Proto-Tethys subduction, led to the formation of an E-W striking paleouplift within the cratonic basin51,59,60. Subsequently, the E-W oriented carbonate platform emerged across the northern and central parts of the basin during the Late Ordovician (Fig. 1b). Multiple unconformities and fault activities occurred in the north uplift of the Tarim Basin from the Ordovician to the Eocene, resulting in extensive karstification in the Halahatang area during the Late Caledonian (Late Ordovician-Silurian) and Hercynian (Carboniferous-Permian) events16,61,62. Based on karstification intensity and stratigraphic erosion, the Halahatang area can be divided from north to south into the northern buried hill area, the middle slope transitional area, and the southern deep burial area58,62. The thickness of the Ordovician carbonate rocks gradually decreases northward (Fig. 3), while karstification intensifies, resulting in a rich and typical karst landform that includes karst highs, karst valleys, karst channels, and sinkholes62.

Fig. 3

(a) Arbitrary seismic inline; (b) arbitrary seismic crossline (flattened on surface C); (c) coherence slice (surface O3), displaying the typical seismic response characteristics of the Halahatang oilfield, which hinder the detection of strike-slip faults. The blue arrows indicate the karst valley. The pink ellipses and arrows indicate the bead-like reflections.

Conjugate strike-slip faults are extensively developed in the Lower Paleozoic carbonates, predominantly featuring compressional deformation6,26 (Fig. 1). Previous studies suggested that these faults formed during the Late Cambrian, with the development of conjugate faults taking place during the Middle to Late Ordovician. The faults were inherited in the Silurian-Devonian and partially reactivated during the Carboniferous-Permian19,20,21,26,52,63. Regional compressive stress controls the formation of strike-slip faults, with fault linkage being the main mechanism for their growth20,26. Due to intense karstification, the geometry of strike-slip faults on the top surface of the carbonates is typically inferred from deeper strike-slip faults22,25,26. The seismic sections (Fig. 3a–b) show strong reflection anomalies in the O3–O1-2 interval, marked by the pink ellipses. These strong reflections are typically narrow laterally, ranging from 20 to 50 m, but extend vertically from 200 to 500 m, with some reaching up to 1 km. Because they are laterally narrow and vertically extensive, these reflections are referred to as bead-like reflections; they are caused by large cavities formed by intense carbonate dissolution19,26. The bead-like reflections are mainly developed in the Ordovician carbonate rocks, especially in the northern area of intense karstification. At the O3 interface, distinct convex and downward-shaped events are visible, which are formed by karst channels and landforms. These observations show that extensive karstification in the study area has produced karst-related features and distinctive seismic responses within the Ordovician carbonate rocks. The karstification and bead-like reflections hinder the identification of strike-slip faults in the Ordovician carbonate rocks (Fig. 3c).

Dataset and methodology

This study uses prestack depth migration (PSDM) 3D seismic data reprocessed in 2023 and provided by PetroChina Tarim Oilfield Company. The 3D seismic survey covers approximately 3500 km² with an inline and crossline (trace) spacing of 25 m. The dominant frequency in the Ordovician carbonate strata ranges from 22 to 27 Hz, which, together with the intense karstification, makes fault mapping and description challenging.
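
To illustrate why a 22–27 Hz dominant frequency limits fault imaging, the quarter-wavelength rule of thumb for vertical resolution can be worked through. The interval velocity of 6000 m/s used here is an assumed, illustrative value for tight carbonates and is not reported in this study.

```latex
\lambda = \frac{v}{f}, \qquad \Delta z \approx \frac{\lambda}{4} = \frac{v}{4f}

% Assumed v = 6000 m/s (illustrative, not from the survey)
f = 22\ \mathrm{Hz}: \quad \Delta z \approx \frac{6000}{4 \times 22} \approx 68\ \mathrm{m}
\qquad
f = 27\ \mathrm{Hz}: \quad \Delta z \approx \frac{6000}{4 \times 27} \approx 56\ \mathrm{m}
```

Under this assumption, features thinner than several tens of meters fall below the tuning thickness, which is consistent with the difficulty of resolving small faults among karst-related anomalies.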

The proposed DL workflow for fault detection comprises three parts (Fig. 4): (1) A Unet network (Fig. 5), pre-trained with synthetic data (see Ref. 38 for the method), serves as a base model to generate an initial fault probability volume. (2) Single-well fault zones are modeled using electrical image logging (FMI) data, with priority given to directional or horizontal wells. These models are then compared with the initial fault probability volume to assign appropriate fault labels (Fig. 6). (3) The base model is fine-tuned through transfer learning. Once the Unet model is optimized, it can be reused for fault detection and analysis.

Additionally, the coherence and maximum likelihood attributes were calculated to compare the performance of the DL method with conventional seismic attribute methods. The principles of these attributes are detailed in the literature25,64,65.

Fig. 4

Integrated workflow diagram outlining the methodology of generating the fault probability volume by Unet-based transfer learning.

Fig. 5

Architecture of Unet, with different arrows representing the following: black arrow: standard 3 × 3 convolution layer; green arrow: max-pooling layer; yellow arrow: upsampling layer; black arrow: copy and concatenate layer (modified from Ref. 38).

Fig. 6

Workflow diagram outlining the methodology of constructing fault samples in actual seismic data by integrating seismic and well data.

Architecture of Unet

Since Unet is currently the most widely used model for fault identification28,47, it was selected for this study. The architecture of Unet, as shown in Fig. 5, was initially proposed by Ronneberger et al. (2015) for 2D medical image segmentation tasks48. Wu et al. (2019a) were the first to introduce this network to fault detection38. The network has a U-shaped symmetric structure, consisting of an encoder, a decoder, and skip connections. The encoder performs feature extraction and downsampling on the seismic data. The decoder, on the right side, upsamples the extracted features and maps them to fault probability. Skip connections between the encoder and decoder aggregate multiscale semantic features from both shallow and deep layers to enhance the accuracy of the predictions. As shown in Fig. 5, the encoder consists of three convolutional blocks, each containing two 3 × 3 × 3 convolutional layers, two ReLU activation layers, and a max-pooling layer. Symmetrically, the decoder also consists of three convolutional blocks, each containing two 3 × 3 × 3 convolutional layers, two ReLU activation layers, and an upsampling layer. Three skip connections at different scales are employed between the encoder and decoder to transmit detailed features.
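
The symmetry described above can be checked with a small shape trace. The following is a minimal sketch, not the network used in this study: it assumes size-preserving ("same"-padded) convolutions, 2× pooling and upsampling, channel doubling per encoder block, a 128-sample input cube, 16 base channels, and a bottleneck stage between encoder and decoder. All of these values are illustrative assumptions.

```python
def unet_shape_trace(input_size=128, base_channels=16, depth=3):
    """Trace (stage, channels, cube edge length) through a U-shaped network.

    Assumptions (illustrative only): 'same'-padded 3x3x3 convolutions keep
    the spatial size, max pooling halves it, upsampling doubles it, and the
    channel count doubles after every encoder block.
    """
    trace = []
    skips = []
    size, channels = input_size, base_channels

    # Encoder: two convolutions per block, then pooling halves each edge.
    for level in range(depth):
        trace.append((f"enc{level + 1}", channels, size))
        skips.append((channels, size))  # features saved for the skip connection
        size //= 2
        channels *= 2

    # Assumed bottleneck stage at the coarsest scale.
    trace.append(("bottleneck", channels, size))

    # Decoder: upsample, concatenate the matching skip, then convolve.
    for level in reversed(range(depth)):
        size *= 2
        skip_channels, skip_size = skips[level]
        if size != skip_size:
            raise ValueError("skip and decoder feature sizes must match")
        channels = skip_channels
        trace.append((f"dec{level + 1}", channels, size))

    # Final layer maps features to a single fault-probability channel.
    trace.append(("fault_probability", 1, size))
    return trace


for stage, channels, size in unet_shape_trace():
    print(f"{stage:18s} channels={channels:4d} size={size}^3")
```

The trace makes explicit that each of the three skip connections joins encoder and decoder features of equal spatial size, and that the output volume has the same dimensions as the input.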

Constructing fault labels

Seismic-geological modeling studies revealed that strike-slip fault zones consist of a fault core and fault damage zones6,22,66,67. The fault core occurs in severely deformed regions that accommodate the majority of deformation, and includes fault gouge, breccia, or caves. On seismic sections, it typically displays discontinuous and chaotic reflectors with variable amplitude22,25,64. Within the fault damage zone, multiple sets of fractures are well developed, which appear as weak amplitudes with moderate continuity on seismic sections6,25. Fault zones exhibit higher strain and deformation intensities than the surrounding rock. The above studies indicate that fault zones present ambiguous responses on seismic profiles, making identification challenging, whereas modeling fault zones using FMI data achieves high accuracy68. The cumulative fracture density method is commonly employed to identify the boundary between the fault zone and surrounding rocks67,68. Therefore, we use image logs for fault zone modeling (Fig. 6). First, FMI data is used to interpret the distribution of wellbore fractures and to generate cumulative fracture density curves. Based on changes in the slope of these curves, the boundary between fault zones and country rock can be delineated. The modeled fault zones are then compared with the pre-generated fault probability volume. Only the single-well fractured zones that align with the fault probability volume are retained, while non-corresponding segments are removed, thereby yielding the fault labels.
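
The cumulative fracture density idea can be sketched in a few lines. This is a simplified illustration rather than the procedure used in this study: the 10 m window, the 2.5 m⁻¹ threshold, and the synthetic fracture depths are all assumptions. Because the slope of the cumulative curve in a window equals the fracture density in that window, a steepening slope is detected here by thresholding the windowed density.

```python
def fracture_density(depths, top, base, window=10.0):
    """Return [(window_top, fractures per metre)] for consecutive windows."""
    n_bins = int((base - top) / window)
    counts = [0] * n_bins
    for depth in depths:
        idx = int((depth - top) / window)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [(top + i * window, count / window) for i, count in enumerate(counts)]


def cumulative_curve(density, window=10.0):
    """Return [(window_base, cumulative fracture count)] down the well."""
    total = 0.0
    curve = []
    for window_top, dens in density:
        total += dens * window
        curve.append((window_top + window, total))
    return curve


def fault_zone_bounds(density, window=10.0, threshold=2.5):
    """Return (top, base) spanning the windows whose density exceeds the
    threshold, i.e. where the cumulative curve steepens; None if no window does."""
    hits = [window_top for window_top, dens in density if dens > threshold]
    if not hits:
        return None
    return (min(hits), max(hits) + window)


# Synthetic picks (hypothetical): 1 fracture/m background between 6600 and
# 6800 m, plus 4 extra fractures/m inside a 6620-6720 m zone.
background = [6600.0 + i for i in range(200)]
zone = [6620.0 + 0.25 * i for i in range(400)]
density = fracture_density(background + zone, top=6600.0, base=6800.0)
print(fault_zone_bounds(density))
print(cumulative_curve(density)[-1])
```

In practice the threshold and window length would need to be chosen per well from the background density of the country rock.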

Taking the H11-1 horizontal well as an example, the fracture density curve was constructed from the FMI data (step 2 in Fig. 6). The fracture density in the tight surrounding rock ranges from 1 to 2 m⁻¹, but between 6623 m and 6720 m it increases abruptly to 3–8 m⁻¹. Geological modeling indicates that this increase is due to the well intersecting a fault zone. Within this fault zone, the fracture density follows an approximately Gaussian distribution, gradually increasing from the edges toward the center and peaking at 7–8 m⁻¹ at 6650 m and 6680 m. Between 6678 m and 6679 m, fractures are absent; on the FMI image this interval appears as a large dark-colored block, corresponding to a cavity formed by dissolution of the fault core. Based on the FMI interpretation, a fault model of the well was established, indicating that the well intersects the fault zone between 6623 m and 6720 m, with the fault core located at 6678–6679 m. In contrast, other sections of the well exhibit low fracture densities, suggesting no fault intersection. The initial fault probability volume predicted two faults in this well (step 3 in Fig. 6), which is inconsistent with the FMI interpretation. Therefore, the predicted fault in the 6800–6900 m interval was excluded during fault sample labeling (step 4 in Fig. 6).
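
The label selection in this example amounts to an interval-overlap check between predicted faults and FMI-derived fault zones along the well. The sketch below is illustrative only: the function names are hypothetical, and the depth range of the retained prediction (6640–6700 m) is an assumed value, since only the rejected 6800–6900 m interval and the 6623–6720 m fault zone are stated above.

```python
def overlaps(a, b):
    """True when two (top, base) depth intervals share any depth range."""
    return a[0] < b[1] and b[0] < a[1]


def select_fault_labels(predicted_intervals, well_fault_zones):
    """Split predicted fault intervals into those supported by a
    well-derived fault zone (kept as labels) and those that are not."""
    kept, rejected = [], []
    for predicted in predicted_intervals:
        if any(overlaps(predicted, zone) for zone in well_fault_zones):
            kept.append(predicted)
        else:
            rejected.append(predicted)
    return kept, rejected


# FMI-derived fault zone in well H11-1 (from the text above).
well_zones = [(6623.0, 6720.0)]
# Two predicted faults along the well path; the first range is hypothetical.
predicted = [(6640.0, 6700.0), (6800.0, 6900.0)]

kept, rejected = select_fault_labels(predicted, well_zones)
print("kept:", kept)
print("rejected:", rejected)
```

A stricter variant could require a minimum overlap length instead of any overlap, at the cost of one more tunable parameter.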

This study used FMI data from 16 directional and horizontal wells to construct fault zone models and annotate fault samples. Our results suggest that the fracture density in fault zones generally ranges from 5 to 8 m⁻¹. The modeling in the northern karst area shows a poor match with the initial fault probability model, while the southern region exhibits a better match. These findings may provide valuable insights for other studies.

Results

Deep transfer learning enhanced identification of strike-slip faults

In seismic sections, coherence, a conventional seismic attribute, can only detect significant anomalies near the O3 and Є3 interfaces (Fig. 7a). No apparent attribute anomalies exist between O3–O1-2 and O1-2–Є3, which is related to the scale of the faults. Moreover, the O3 surface is strongly affected by karstification, resulting in bead-like reflections and significant attribute anomalies. While effective in identifying large faults in areas lacking significant bead-like reflections, this attribute struggles to recognize smaller faults where bead-like reflections are prominent. The maximum likelihood attribute (Fig. 7b) identifies more small faults, offering higher resolution than coherence. However, these small faults may include false identifications, making it difficult to distinguish real faults and understand their spatial relationships. The detection by Unet pretrained with synthetic data (UNPS) (Fig. 7c) effectively addresses this issue, providing better imaging of fault planes and identifying both main and secondary faults with higher resolution. However, block-like anomalies caused by bead-like reflections still exist, as seen in the pink circle in Fig. 7c (interval 6500–7500 m). The detection by Unet-based transfer learning (UNTL) (Fig. 7d) proposed in this study achieves the best continuity and resolution in fault identification while suppressing attribute anomalies caused by bead-like reflections. It assists in interpreting the extent and combination of faults.

Fig. 7

(a) Coherence section; (b) Maximum likelihood section; (c) Fault probability section computed by Unet trained with synthetic data (UNSD); (d) Fault probability section computed by Unet-based transfer learning (UNTL) in the Halahatang oilfield (see location in Fig. 8 and interface code in Fig. 2).

Figure 8 presents map views of the coherence attribute (Fig. 8a), the maximum likelihood attribute (Fig. 8b), the UNPS detection (Fig. 8c), the predictions of Unet trained with only a few actual fault labels (Fig. 8d), and the UNTL detection (Fig. 8e) in the Halahatang area. The conventional coherence attribute fails to identify fault features due to strong karstification and bead-like reflections, instead highlighting numerous curved and meandering karst channels, as well as circular or columnar point-like features (Fig. 8a). In contrast, the maximum likelihood attribute identifies, to some extent, a NE-trending fault along the H13-H16-9 well and suppresses non-fault features, yet its performance remains poor in areas with severe karstification, where it shows ambiguous blocky anomalies (Fig. 8b). These anomalies significantly increase the occurrence of false fault detections and make fault identification more challenging.

Fig. 8

(a) Plane view of coherence; (b) Plane view of maximum likelihood; (c) Plane view of the UNSD attribute; (d) Plane view of the fault probability computed by using Unet trained with few actual fault samples; (e) Plane view of the UNTL attribute of the O3 reflecting interface in the Halahatang oilfield (see location in Fig. 9).

Compared to conventional seismic attributes, the fault prediction results based on Unet effectively suppress non-fault features and clearly depict linear fault characteristics, which benefits the identification of both primary and secondary strike-slip faults (Fig. 8c–e). Specifically, the UNPS attribute (Fig. 8c) mitigates the effects of karst channels and bead-like reflections, strengthening the linear characteristics of the main strike-slip faults, particularly along the H13-H16-9 well area, the NE-striking fault on the east side of well H601, and the NW-striking fault where R6-1 is located. However, in areas with intense karstification, influences from karst channels and bead-like reflections persist, such as along the karst channel at well HA8-1 and on the western side of wells H13 and H6. In these areas, the linear features of faults are not prominent, resulting in ineffective fault identification. The UNTL attribute (Fig. 8e) demonstrates the best performance, providing the clearest identification of the main strike-slip faults. This method effectively suppresses the interference from karst channels and bead-like reflections, enabling the identification of multiple conjugate NE- and NW-striking strike-slip faults. Moreover, it also detects several secondary faults close to the primary faults. By contrast, the Unet trained directly with a few actual fault labels (Fig. 8d) performs poorly: it shows a limited number of major faults with weak continuity, which may give rise to incorrect fault detections, and it cannot detect secondary faults.

Geometry of the strike-slip fault

Based on the UNTL attribute, the fault distribution in map view and the structural styles in section view were studied as follows:

The UNTL attribute slice of interface O3 reveals two strike-slip fault sets striking roughly NE and NW (Fig. 9). These faults intersect to form diamond-shaped patterns, constituting a conjugate strike-slip fault system. Among the principal strike-slip faults, subsidiary minor faults further complicate the fault system. The strike-slip faults exhibit classic structural styles, including linear structures, oblique arrays, overlaps, pull-apart structures, and horsetail structures, and display both full and incomplete sets of conjugate faults. Conjugate strike-slip systems prevail in the northern part of the study area, while the southern part is dominated by simple shear. These features cannot be observed in conventional seismic attributes, because superimposed karstification produces numerous bead-like reflections and karst channels along these faults. These reflections and channels appear as block and point anomalies in the coherence attribute rather than exhibiting the linear characteristics of faults.

Fig. 9

Plane view of the fault probability computed by Unet-based transfer learning of the O3 reflecting interface in the Halahatang oilfield.

The UNTL attribute, co-rendered with the 3D seismic data, reveals different fault styles in seismic sections, exhibiting varying degrees of fault connectivity (Fig. 10). Strike-slip faults predominantly develop within the Paleozoic carbonates and are characterized by subvertical fault planes. They display various characteristics along their vertical extent, interact with each other between different segments, and form four composite styles. The first type is the through-going type (Fig. 10a), where the faults extend directly from deep layers to the top surface of the Ordovician system without any interruption. These faults typically exhibit significant displacement (40–80 m). The second type is the hard linkage type (Fig. 10b), where two or more isolated faults are stacked vertically. These faults are connected by linking faults between different segments, occasionally resulting in overlap between segments. The third type is the soft linkage type (Fig. 10c), where multiple isolated faults exist in both deep and shallow layers, and connecting faults are absent. The deformation patterns are inconsistent across segments, typically manifesting as weaker deformation in shallow layers and stronger deformation in deeper ones. The fourth type is the isolated type (Fig. 10d), where faults generally develop within the Ordovician system, and different fault segments do not interfere with each other.

Fig. 10

Fault probability, co-rendered with seismic data, showing the detailed style of strike-slip faults in the Halahatang oilfield (see location in Fig. 9 and interface code in Fig. 2).

Discussion

Reliability of the UNTL attribute

We evaluate the reliability of the UNTL attribute from two perspectives: comparison with conventional seismic attributes and consistency with well data.

The coherence attribute is unable to accurately represent the geometry of strike-slip faults in the Ordovician because of the significant interference caused by karstification. In the deep Cambrian system, however, karstification is not developed and distinct bead-like reflections are absent18,22,62, so we computed the coherence and UNTL attributes at the Є3 interface (Fig. 11). Both the coherence and the UNTL attribute maps reveal a fault striking approximately NW on the western side of wells H6011 and H13. The overall orientation of these two attribute features is similar, indicating a high degree of correlation between deep learning and conventional seismic attributes when the seismic response of the fault is clear. Furthermore, in the UNTL attribute map, the NE-trending strike-slip fault between wells HA16-9 and H13 (Fig. 11b), as well as the NW-striking fault near well R6-1, appear clearer and more continuous than in the coherence attribute (Fig. 11a). The secondary smaller faults near the main fault are also highlighted. These observations suggest that the deep learning attribute has a higher resolution than the conventional coherence attribute.

Previous studies suggested that the Ordovician limestone in the Halahatang area is tight, with porosity less than 3% and permeability less than 0.5 mD15,17,18. Strike-slip faults play a significant controlling role in the Ordovician carbonate, and oil production mainly comes from fracture reservoirs along the fault damage zones18. Oil production decreases with increasing distance from the strike-slip faults13,15,21,22. Therefore, we overlaid the UNTL attribute with high-production wells (cumulative production greater than 30,000 t), medium-production wells (cumulative production greater than 10,000 t), and dry wells (Fig. 11). The results show that the majority of the high-production and medium-production wells are located near the strike-slip faults (Fig. 11), while low-production wells are either not associated with faults or located far away from them. This indicates a good agreement between the UNTL attribute and the well data.
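
The comparison between well productivity and fault proximity can be quantified with a simple map-view distance calculation. The sketch below is a hypothetical illustration, not the analysis performed in this study: fault traces are approximated as polylines, and all coordinates and class labels are invented for demonstration.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest map-view distance from point p to segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_faults(well_xy, fault_traces):
    """Distance from a well to the nearest segment of any fault polyline."""
    return min(
        point_segment_distance(well_xy, trace[i], trace[i + 1])
        for trace in fault_traces
        for i in range(len(trace) - 1)
    )


def mean_distance_by_class(wells, fault_traces):
    """wells: list of (production_class, (x, y)); returns class -> mean distance."""
    sums = {}
    for production_class, xy in wells:
        d = distance_to_faults(xy, fault_traces)
        total, count = sums.get(production_class, (0.0, 0))
        sums[production_class] = (total + d, count + 1)
    return {cls: total / count for cls, (total, count) in sums.items()}


# Hypothetical fault trace and well positions (metres, local coordinates).
faults = [[(0.0, 0.0), (1000.0, 0.0)]]
wells = [("high", (500.0, 100.0)), ("high", (200.0, 300.0)), ("dry", (500.0, 2000.0))]
print(mean_distance_by_class(wells, faults))
```

Fault traces extracted from the UNTL attribute and real well locations would replace the invented inputs; a clear separation of mean distances between production classes would support the agreement described above.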

Fig. 11

(a) Plane view of coherence; (b) Plane view of the UNTL attribute of the Є3 reflecting interface in the Halahatang oilfield (see location in Fig. 9).

Overall, it is believed that the UNTL attribute has high reliability and resolution compared to conventional seismic attributes.

Deep learning for strike-slip fault imaging

Conventional seismic attributes, such as coherence24,69, curvature70, maximum likelihood71, and gradient structural entropy22, detect faults by estimating the discontinuities between seismic traces. Over the past few decades, fault interpretation has relied on manual interpretation combined with these seismic attributes, which is time-consuming and subject to interpreter bias28, so that results vary significantly among interpreters. Moreover, because of how these algorithms work, the attributes also respond to other geological features such as channels, volcanoes, and carbonate karst, which complicates the detection of faults24. While the maximum likelihood attribute (Fig. 7b) has a higher resolution than coherence (Fig. 7a) and successfully identifies numerous small faults, it may also identify false faults, posing challenges in distinguishing true faults and understanding their spatial relationships. For instance, to the north of wells H6-H6011, the maximum likelihood attribute detected numerous anomalies (Figs. 7b and 8b). Although these anomalies appear as linear features similar to faults in section (Fig. 7b), their disordered appearance on the map makes it difficult to determine whether they represent faults (Fig. 8b).
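
To make the notion of trace-to-trace discontinuity concrete, the sketch below computes semblance, one common formulation of coherence. It is offered as an illustration only; the source does not state which coherence algorithm was used, and the convention of returning 0.0 for a zero-energy window is an assumption of this sketch.

```python
def semblance(traces):
    """Semblance of a window of neighbouring traces.

    traces: list of equal-length sample lists, one per trace.
    Returns a value between 0 and 1: 1 for identical traces, lower values
    where waveforms differ laterally (for example across a fault or a
    karst cavity).
    """
    n_traces = len(traces)
    n_samples = len(traces[0])
    numerator = 0.0
    denominator = 0.0
    for k in range(n_samples):
        stacked = sum(trace[k] for trace in traces)
        numerator += stacked * stacked
        denominator += sum(trace[k] * trace[k] for trace in traces)
    if denominator == 0.0:
        return 0.0  # zero-energy window: value undefined, 0.0 by convention here
    return numerator / (n_traces * denominator)


continuous = [[1.0, -2.0, 3.0]] * 3                       # laterally identical
disrupted = [[1.0, -2.0, 3.0], [-1.0, 2.0, -3.0]]         # polarity flip
print(semblance(continuous), semblance(disrupted))
```

Because the measure responds to any lateral waveform change, it cannot by itself separate faults from karst channels or bead-like reflections, which is the limitation discussed above.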

Traditional fault interpretation methods often operate on 1D samples or rely on human visual perception, which is particularly adept at discerning 2D features. Recent research has highlighted that faults are not simply 2D planar structures but rather 3D volumetric bodies6,20,66,67, and relying on 2D approaches may lead to overlooking faults that are not observable from a single perspective28,72,73. In contrast to these conventional techniques, DL methods can simultaneously consider a series of 2D or even 3D samples, thereby extracting high-dimensional, complex, and abstract features. These features render the process less sensitive to noise or other geological bodies28,41,74,75. Consequently, DL methods yield smoother and more continuous fault detections than conventional methods (Fig. 7c–d), especially when dealing with noisy seismic data38.

Although the UNPS attribute shows significant improvements over conventional seismic attributes (Figs. 8 and 9), fault detection remains problematic in areas of intense local karstification. A likely reason is that the synthetic training dataset primarily accounts for folds, faults, noise, and the dominant frequencies of seismic data, while other geological factors, such as karstification, which can generate a fault-like seismic response, are neglected. As a result, the generalization ability of the DL model is limited. Nonetheless, this type of synthetic training dataset yields significantly better fault detection results than conventional seismic attributes in most simple geological settings38,43,44,50,76. Training the DL model directly with actual fault labels also leads to unsatisfactory detection results (Fig. 8d). This can be attributed to the limited number of fault labels, which is insufficient to meet the training requirements and results in overfitting. The workflow proposed in this study addresses this limitation by using a small number of actual fault labels to further optimize the pre-trained network model through transfer learning. The UNTL attribute successfully suppresses the impact of karstification, highlights strike-slip faults on deep surfaces, and achieves the best performance (Fig. 8e).
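
The reasoning above (pre-training on synthetic data that lacks karst, then fine-tuning with a few actual labels) can be sketched with a deliberately simplified stand-in. The example below uses a three-weight logistic classifier rather than a U-Net, and every feature, sample, learning rate, and epoch count is a hypothetical choice for illustration only; it does not reproduce the network or the data of this study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(weights, samples):
    """Mean binary cross-entropy over (features, label) pairs."""
    total = 0.0
    for x, y in samples:
        p = sigmoid(sum(w * v for w, v in zip(weights, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(samples)


def train(weights, samples, lr, epochs):
    """Full-batch gradient descent; returns a new weight list."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in samples:
            err = sigmoid(sum(wi * v for wi, v in zip(w, x))) - y
            for k, v in enumerate(x):
                grad[k] += err * v / len(samples)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Features (all hypothetical): (bias, discontinuity strength, karst indicator).
# Synthetic set: many samples, but karst never occurs, so its weight stays 0.
rng = random.Random(0)
synthetic = []
for _ in range(200):
    d = rng.random()
    synthetic.append(((1.0, d, 0.0), 1 if d > 0.5 else 0))

# A few "actual" labels: karst also produces strong discontinuity but is
# labelled as non-fault (0).
real = [
    ((1.0, 0.90, 0.0), 1),
    ((1.0, 0.85, 0.0), 1),
    ((1.0, 0.80, 1.0), 0),
    ((1.0, 0.90, 1.0), 0),
    ((1.0, 0.20, 0.0), 0),
    ((1.0, 0.10, 0.0), 0),
]

pretrained = train([0.0, 0.0, 0.0], synthetic, lr=1.0, epochs=300)
# Fine-tuning starts from the pre-trained weights and runs only briefly.
fine_tuned = train(pretrained, real, lr=0.5, epochs=50)

print(mean_loss(pretrained, real), mean_loss(fine_tuned, real))
```

In this toy setting the pre-trained model cannot learn anything about the karst feature because it is absent from the synthetic set, whereas a short fine-tuning run on a handful of labelled samples assigns it a negative weight and lowers the loss on the actual labels. Starting from the pre-trained weights, instead of training from scratch on six samples, mirrors the motivation for transfer learning given above.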

Therefore, it is recommended to incorporate DL methods into the interpretation workflow for strike-slip faults in the deep subsurface, especially in geologically complex settings. This approach reduces subjectivity in interpretation and aids in understanding the distribution of strike-slip faults and structural patterns. Moreover, with the emergence of new DL models and the growing number of fault labels from actual seismic data, identification accuracy and computational efficiency are expected to improve further.

Implication for oil accumulation

Geochemical studies have indicated that the hydrocarbon source rocks for the Ordovician carbonate oil and gas in the northern Tarim Basin include Lower Cambrian and Middle-Upper Ordovician source rocks53,54,57,77. Whether the main source rocks are Lower Cambrian or Middle-Upper Ordovician has long been controversial. Although the number of wells drilled in the northern Tarim Basin has increased, most wells have not encountered Ordovician source rocks. Instead, high-quality shale source rocks have been found in the Lower Cambrian Yuertusi Formation6,15,17,63,78. Furthermore, high-production wells are primarily distributed along fault zones, while wells located away from fault zones tend to produce water or to be dry13,18, suggesting that Middle-Upper Ordovician source rocks may not be widely present. These observations indicate that the black shale of the Lower Cambrian, rather than the Ordovician source rocks, is the most effective hydrocarbon source rock16,56,79,80,81,82. The influence of faulting on hydrocarbon migration has received significant attention. Steep and subvertical strike-slip faults can connect the deep Yuertusi Formation source rocks with the Ordovician carbonate rocks, forming efficient hydrocarbon migration pathways. Geochemical data also support the migration of oil and gas from the Cambrian to the Ordovician carbonate reservoirs16,56,82. The formation of strike-slip faults predates the major hydrocarbon accumulation period, with only localized faults reactivating after the Carboniferous15,56,58. Consequently, these faults are conducive to hydrocarbon migration.

Previous studies on the strike-slip faults in the Halahatang oilfield primarily relied on conventional seismic attributes and incorporated seismic attributes of the deep layer to mitigate the impact of intense karstification on the top surface of the Ordovician carbonate rocks15. As a result, these studies suggested that the strike-slip faults in this area are characterized by subvertical fault planes and flower structures, and they emphasized the correlation between fault styles, map-view segmentation, and oil and gas migration15,22,58. Through DL attribute analysis, four structural styles of strike-slip faults with pronounced vertical segmentation were recognized, each playing a different role in hydrocarbon accumulation. The continuity of the faults determines the accumulation and enrichment of oil and gas. Vertically continuous faults favor the accumulation of hydrocarbons in the Ordovician carbonate rocks, whereas isolated or soft-linkage faults may have limited control over hydrocarbon accumulation. High-yield wells (such as wells HA12-12, H9-12, and H16-9) are predominantly located on thorough-type and hard-linked faults, while wells located on isolated fault segments or soft-linkage faults generally exhibit low or no production (Fig. 12). This suggests that hydrocarbon migration is closely related to the vertical connectivity types of, or the interactions between, segments of strike-slip faults. This study highlights the significance of the continuity of strike-slip faults in hydrocarbon migration and accumulation.

Fig. 12

Comparison of high-production and low-production wells displayed on the UNTL attribute, co-rendered with seismic data (see well location in Fig. 9 and interface code in Fig. 2). The strike-slip faults hosting the high-production wells generally exhibit thorough or hard-linkage types.

Conclusions

In this study, a method for constructing fault labels in actual seismic data was proposed for the first time. We fully consider the advantages and limitations of deep learning and use a transfer learning technique to overcome the problems of insufficient fault labels and limited generalization ability. The workflow was applied in the Halahatang Oilfield, and the main conclusions of this study are as follows:

(1) The fault labels construction method based on seismic-well tie offers objective labels, reduces human bias, and provides accurate training data for the DL model.

(2) Compared to conventional seismic attributes and detection computed by Unet pre-trained with synthetic data, the proposed method is favorable for the detection of strike-slip faults under a complex geological setting, effectively suppressing non-fault features and yielding higher precision in fault recognition, accompanied by clearer imaging of fault planes.

(3) The degree of continuity of strike-slip faults is strongly correlated with hydrocarbon enrichment. Faults that are thorough or hard-linkage type favor hydrocarbon accumulation, whereas soft-linkage and isolated faults hinder accumulation.

This method enables the identification of hidden strike-slip faults in complex geological conditions using a small number of actual fault labels, providing valuable support for well trajectories and development plans. The results and workflow derived from this case study offer valuable references for fault identification and integrated analysis in diverse regions with complex geological conditions.