Source rock characterization and oil grouping in the NW Java, Central Java and NE Java Basins, Indonesia

This study presents the detailed organic geochemistry of crude oils (acquired from wells and seepages) and rock extracts from the NW Java and NE Java basins, gathered and compiled from previous publications. The interpretation is based on geochemical data values and plots, GC-MS fingerprints, and agglomerative hierarchical cluster analysis using the Euclidean distance. Various source rocks from those basins were deposited in fluvio-lacustrine to marine environments. Six groups of crude oils are distinguished. Groups 1, 2, and 6 are oils from deltaic source rocks, Groups 3 and 4 are oils from marine source rocks, and Group 5 is from lacustrine and/or fluvio-lacustrine source rocks. Groups 1, 2, and 6 can be distinguished by the pristane/phytane (Pr/Ph) ratio and C29 sterane composition, while Groups 3 and 4 differ in the distribution of C27 sterane. A schematic depositional environment of the source rocks is also generated from this study and suggests that the Group 5 source rocks were deposited in early syn-rift non-marine settings, while those of the remaining groups were deposited in deltaic (Groups 1, 2, and 6) and marine settings (Groups 3 and 4). The main differences between those groups include the distributions of C27-C28-C29 steranes.


INTRODUCTION
A series of back-arc basins in Java within Sundaland (Figure 1) is known for its prolific oil and gas accumulations. They lie from west to east, from the NW Java Basin to the NE Java Basin. These basins shared a roughly similar tectonostratigraphic history during the Cenozoic, which consists of Cretaceous-Early Paleogene? pre-rift, Late Eocene-Oligocene synrift, Early Miocene postrift, and Middle Miocene inversion (Satyana and Purwaningsih, 2003; Doust and Noble, 2008). These similarities led to similar types of source rocks, which were deposited during the early and late synrift basin phases. However, heterogeneities within individual source rocks might cause different compositions of crude oil and natural gas. This paper analyses the detailed organic geochemistry of crude oils from various basins in Java based on their composition and biomarker characteristics compiled from various publications.

PREVIOUS WORKS
Previous works on oil-to-source rock correlation in several basins in Java mostly emphasized conventional geochemical analysis of biomarker and isotope data with quantitative (geochemical bivariate plot) and qualitative (comparing fragmentogram peaks) methods [e.g., Satyana and Purwaningsih, 2003; Wiloso et al., 2008; Devi et al., 2018 in NE Java Basin; Subroto et al., 2008; Praptisih, 2018 in Central Java; Ponto et al., 1988; Rahmad (2016) in NW Java Basin].
Figure 1. Back-arc basins in Java, Indonesia. The oil samples for this paper were taken from the NW Java Basin and NE Java Basin (green areas indicate producing basins), while the remaining basins are not studied. Some authors suggest that NW Java Basin is a bigger basin that contains several sub-basins including Sunda-Asri, Billiton, and Ardjuna. In this paper, the classification of basins in Indonesia follows Doust and Noble (2008).
However, hierarchical clustering analysis (HCA) is rarely performed since not all samples were analyzed using the full range of geochemical analyses from bulk composition to stable carbon isotopes. The other reason is that not all samples contain the same biomarkers, or the biomarkers coelute and are difficult to quantify.
In NW Java, Napitupulu et al. (1997) differentiated three groups of oils from lacustrine, marine shale, and marine carbonate source rocks. In the eastern part of NW Java Basin (near Semarang, Central Java), oil seepage was studied by Praptisih (2018). In East Java, Devi et al. (2018) suggested one oil family from deltaic to marginal marine source rock in NE Java Basin. Satyana and Purwaningsih (2003) discussed the geochemistry of oils and extracts from East Java Basin from 100 wells and seeps (86 crude oils, 35 natural gases, and 57 rock samples from onshore and offshore areas). Their study suggests that there are two "classes" of oils based on various geochemical parameters and plots; however, hierarchical cluster analysis was not performed (Satyana and Purwaningsih, 2003). The classification that they used was one from BP Research Center (1991, in Satyana and Purwaningsih, 2003), which enables oils to be put into one of five categories depending upon their bulk properties: A (marine nonclastic with mixed algal and bacterial input), B (marine clastic with mixed algal and bacterial input), C (lacustrine algae), D (high resin terrestrial) and E (low resin terrestrial). Satyana and Purwaningsih (2003) classified oils in East Java into two classes: Class D1 and Class D2.
Other work that incorporated HCA and principal component analysis (PCA) has been done by Sosrowidjojo (2011) in the Sunda-Asri Basin and Ramadhina et al. (2017) in the NE Java Basin. The HCA and PCA performed by Sosrowidjojo (2011) indicate that oils in the Sunda-Asri Basin can be classified into 6 clusters that reflect their different geographical locations rather than differences in source rocks. These 6 clusters consist of North Oil Field (2 clusters), Central Oil Field (2 clusters), South Oil Field (1 cluster), and 1 mixed oil group. In the NE Java Basin, Ramadhina et al. (2017) classified the crude oils from published data into 3 main groups that reflect subtle differences in API gravity and "final" depositional environment, suggesting that the initial source material was similar (of non-marine origin) and was then possibly transported and deposited in different places, from terrigenous to transitional and marine. Unlike the work done by Sosrowidjojo (2011) and Ramadhina et al. (2017), the current study attempts to perform HCA on crude oils and source rocks from across several basins.

DATA AND METHOD
Several publications from NW Java Basin and NE Java Basin (Figure 1) that contained geochemistry of crude oil, rock extract, and oil seepages were studied, analysed, and compiled to understand (1) source rock facies and depositional environment, (2) oil geochemistry from those basins, (3) classification and groups of oils based on their geochemistry parameters using hierarchical cluster analysis, and (4) schematic depositional environments of each group of source rock.
In this study, several biomarkers were analysed, including pristane and phytane ratio (Pr/Ph), n-alkanes distribution, C27-C28-C29 steranes ternary diagram, and fingerprint of selected steranes and terpanes. Table 1 explains those parameters and their interpretation.
Each publication mentioned above has a different range of geochemical analyses; however, most of them report at least one or two important source-related biomarker parameters (e.g., the pristane/phytane ratio and the C27-C28-C29 sterane distribution). Therefore, cluster analysis is also used to understand the similarity and/or dissimilarity of the samples based on their geochemical characteristics and tectonostratigraphy.
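The C27-C28-C29 sterane ternary diagram mentioned above relies on normalising the three sterane abundances to 100% before plotting. The following is a minimal sketch of that normalisation and of the conversion to plotting coordinates. The function names and the corner layout (C27 at bottom-left, C29 at bottom-right, C28 at the top) are assumptions made for illustration, not taken from the compiled publications.

```python
import math


def normalise_steranes(c27, c28, c29):
    """Rescale raw C27, C28, C29 sterane abundances so they sum to 100%."""
    total = c27 + c28 + c29
    if total <= 0:
        raise ValueError("sterane abundances must sum to a positive value")
    return tuple(100.0 * v / total for v in (c27, c28, c29))


def ternary_xy(c27, c28, c29):
    """Map a sterane composition onto a unit-sided ternary diagram.

    Assumed corner layout (illustrative only):
    C27 = (0, 0), C29 = (1, 0), C28 = (0.5, sqrt(3)/2).
    """
    _, p28, p29 = normalise_steranes(c27, c28, c29)
    x = (p29 + 0.5 * p28) / 100.0
    y = (math.sqrt(3) / 2.0) * p28 / 100.0
    return x, y
```

Any raw peak areas or heights can be passed in, since only their relative proportions matter after normalisation.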

Hierarchical Clustering Analysis (HCA) for Oil Grouping
Cluster analysis is one of the common methods used by geochemists to classify oils. Peters et al. (1999) used this method to classify crude oils from Eastern Indonesia, Sosrowidjojo (2011) in the Sunda-Asri Basin, and Ramadhina et al. (2017) in the NE Java Basin. In this paper, the geochemical data from various publications were subjected to oil-to-oil correlation. In addition to qualitative and quantitative geochemical analysis, HCA, a common method that has been widely used in many scientific applications (e.g., Gower et al., 1967), was also performed. The goal of HCA is not to find a single partitioning of the data, but a hierarchy (generally represented by a tree) of partitions that may reveal interesting structures in the data at multiple levels of granularity (Balcan et al., 2014). The most widely used hierarchical methods are the bottom-up, or agglomerative, clustering techniques; most of these techniques start with a separate cluster for each point and then progressively merge the two closest clusters until only a single cluster remains (Balcan et al., 2014). To calculate the distance (d) between two points (p and q), the Euclidean distance is used, as shown in Equation 1, where n is the number of geochemical parameters and p_i and q_i are the values of the i-th parameter for each point:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (1)$$

The samples selected for the analysis have at least six parameters, such as Pr/Ph, S%, C27-C28-C29 steranes, and the oleanane/hopane ratio. Therefore, in this study, HCA was only performed for samples that have multiple geochemical parameters. Various limitations, including geochemical parameters that are not available for all samples in the published work, hydrocarbon alteration, and thermal maturation, are not discussed in this paper. This study assumes that such factors will not greatly affect the results of HCA. Finally, the HCA was performed using XL Stats software with a high confidence level.
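The bottom-up procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the workflow used in this study: the linkage criterion (average linkage here) is not specified in the text, the sample names and values are hypothetical, and the parameters are left unscaled even though they have different units.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length parameter vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def agglomerate(samples, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    samples: dict mapping sample name -> parameter vector.
    Average linkage is an assumption; the study does not state its linkage.
    Returns a sorted list of sorted name lists.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[name] for name in samples]  # one cluster per sample

    def linkage(a, b):
        # Mean pairwise distance between members of two clusters
        dists = [euclidean(samples[x], samples[y]) for x in a for y in b]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical values: (Pr/Ph, S wt%, C27 %, C28 %, C29 %, oleanane/hopane)
oils = {
    "oil_A": (5.0, 0.1, 20.0, 25.0, 55.0, 0.6),
    "oil_B": (4.8, 0.1, 22.0, 24.0, 54.0, 0.5),
    "oil_C": (1.5, 0.4, 45.0, 30.0, 25.0, 0.1),
    "oil_D": (1.7, 0.5, 44.0, 31.0, 25.0, 0.1),
}
groups = agglomerate(oils, 2)
```

With these made-up values, the two C29-rich oils end up in one cluster and the two C27-rich oils in the other. Because the sterane percentages span a much larger numeric range than Pr/Ph or sulphur, they dominate the unscaled distance; standardising each parameter first would be one way to balance them.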

Table 1. Geochemical parameters used in this study and their interpretation.
Parameter: n-alkanes. Method: GC. Interpretation: The distribution of n-alkanes is an indicator of organic matter input, e.g., nC27, nC29, and nC31 are indicators of terrestrial origin. Reference: e.g., Tissot and Welte (1984).
Parameter: Pr/Ph. Method: GC.
Parameter: Carbon isotope. Interpretation: A combination of stable carbon isotope ratios of saturates and aromatics, a sterane ternary diagram, and other supporting data was used to classify oil groups and relate most of them to source rock extracts. Reference: Grantham et al. (1988).

GEOLOGICAL SETTINGS AND TECTONOSTRATIGRAPHY
Java island of Indonesia is located in the SE part of Sundaland, in which various back-arc basins lie, including the NW Java Basin and NE Java Basin (Figure 1) studied in this paper. In nearly all the basins within Sundaland, including Java and Sumatra, four stages of tectonostratigraphic evolution can be recognized (Doust and Noble, 2008).
The record in Sumatra and Java suggests a rifting stage that began in the Eocene and terminated in the Oligocene, followed by a post-rift stage since the Miocene (Figure 2).
The four stages of tectonostratigraphic evolution and their relations to petroleum system elements are summarised in Table 2.

SOURCE ROCK FROM JAVA
The geochemical cross plot of stable carbon isotopes of the saturate fraction versus Pr/Ph (Figure 3) distinguishes three different groups of source rocks, namely the Banuwati, Talangakar, and Ngimbang formations from the Sunda-Asri, NW Java, and NE Java basins, respectively. On the sterane plot, all samples have a predominance of C29 steranes, suggesting input from terrestrial higher plant organic matter (Figure 4, Table 3; Huang and Meinschein, 1979). [...] (Devi et al., 2018) and one from Tuban (Satyana and Purwaningsih, 2003) have higher C27 and C28 steranes, indicating a predominance of marine and lacustrine organic matter, respectively.
[...] NW Java data are from Bishop (2000) and Rahmad (2016), and NE Java data are from Satyana and Purwaningsih (2003) and Devi et al. (2018).
[...] (Howes and Trisnawijaya, 1995).
Table 2. Tectonostratigraphy and its relation to petroleum system elements in several basins within Java (Doust and Noble, 2008). The symbols SR, R, and C are for source rock, reservoir, and caprock or seal, respectively.
[...] Huang and Meinschein (1979) studied the importance of steranes as indicators for the depositional environment, where a high abundance of C27, C28, and C29 steranes suggests marine, lacustrine, and terrestrial environments, respectively. NW Java data are from Bishop (2000) and Rahmad (2016), and NE Java data are from Satyana and Purwaningsih (2003) and Devi et al. (2018).

Tectonostratigraphy in Java Basin
The Banuwati Formation is an excellent deep lacustrine Type I source rock in the Sunda-Asri Basins, with TOC of up to 8 wt% and a hydrogen index (HI) of up to 650 mg HC/g TOC (Doust and Noble, 2008; Ralanarko et al., 2020). In the Ardjuna sub-basin (NW Java Basin), the Talangakar Formation is the known source rock, with TOC of 40-70 wt% in coals and 0.59 wt% in the shales and HI of 200-400 mg HC/g TOC (Ponto et al., 1988), indicating that the source rock is oil- and gas-prone (Bishop, 2000). Noble et al. (1991) distinguished three facies of the Talangakar Formation: delta plain (coal facies), delta plain (shale facies), and marine-influenced interdistributary bay. Delta plain coals have higher TOC and HI, of 62.7-72.2 wt% and 348-406 mg HC/g TOC, respectively. Napitupulu et al. (1997) observed the occurrence of Botryococcane in oils from the NW Java Basin, which indicates that they were derived from lacustrine source rocks. In NE Java, the Ngimbang Formation has TOC from 0.79-40.15 wt% and HI from 107-282 mg HC/g TOC (Devi et al., 2018).

OIL GROUPING IN NW JAVA AND NE JAVA BASINS
Oil grouping in NW Java and NE Java basins is conducted using hierarchical cluster analysis where the main parameters are sulphur content, Pr/Ph ratio, oleanane/hopane ratio, and distribution of C27-C28-C29 steranes (Tables 4 and 5). Other geochemistry plots are also used to understand the correlation between each group.
Based on those methods, six oil groups in NW Java and NE Java basins are classified (Table 4) and presented in a dendrogram (Figure 5).
Table 4. Six groups of oils are observed in this study. Groups 1, 2, and 6 are crude oils from deltaic source rocks, Groups 3 and 4 are crude oils from marine source rocks, and Group 5 is crude oils from lacustrine and fluvio-lacustrine source rocks. Pr/Ph ratio, S (sulfur, wt%), C27-C28-C29 steranes (%), and oleanane/hopane are the parameters used for the HCA in this study.
Table 5. This study distinguished six groups of oils. Oil geochemistry in this study is compiled from NW Java Basin (Napitupulu et al., 1997) and NE Java Basin (Satyana and Purwaningsih, 2003).
Most oil groups show Pr/Ph ratios higher than 1 (see Table 4) and a predominance of C29 sterane, indicating that these crude oils were derived from source rocks deposited under suboxic-oxic conditions with predominant input from terrestrial organic matter. The crude oils also contain oleanane, a biomarker derived from angiosperms that indicates a Cretaceous or younger age (Peters et al., 2004).
Groups 1 and 2 are crude oils derived from deltaic shale facies, where the main differences are the distribution of C29 steranes and the Pr/Ph ratio. Groups 3 and 4 are crude oils derived from marine shale source rocks based on the distribution of C27 sterane (Huang and Meinschein, 1979; Figure 6 and Table 4), which ranges from 40-48% and 38-47%, respectively. Group 5 is oil from lacustrine and fluvio-lacustrine source rocks based on the high proportion of C28 sterane (Huang and Meinschein, 1979; Figure 6 and Table 4), which ranges from 42-49%. Group 6 has the same characteristics [...]
Figure 6. Sterane distribution of six oil groups in Java, Indonesia, indicating depositional environments from fluvio-lacustrine to shallow marine (after Huang and Meinschein, 1979). Oil grouping of Java oils shows a good correlation with C27-C28-C29 steranes. One oil from Semarang, Central Java is obtained from Praptisih (2017) and categorised as Group 6 in this study. This sterane distribution is compiled from various publications including NW Java Basin (Napitupulu et al., 1997) and NE Java Basin (Satyana and Purwaningsih, 2003).
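The sterane-based reading used for the groups above (after Huang and Meinschein, 1979) can be expressed as a simple rule: the most abundant of the C27, C28, and C29 steranes points to marine, lacustrine, or terrestrial organic matter, respectively. The sketch below is a deliberately simplified illustration. The function name and input values are hypothetical, and the grouping in this study also relies on Pr/Ph, sulphur content, and oleanane/hopane rather than on steranes alone.

```python
def dominant_sterane_environment(c27, c28, c29):
    """Return the dominant sterane and the environment it suggests.

    Simplified rule after Huang and Meinschein (1979), as summarised in the
    text: C27 -> marine, C28 -> lacustrine, C29 -> terrestrial.
    On an exact tie the first sterane in C27, C28, C29 order is returned.
    """
    labels = {"C27": "marine", "C28": "lacustrine", "C29": "terrestrial"}
    values = {"C27": c27, "C28": c28, "C29": c29}
    top = max(values, key=values.get)
    return top, labels[top]


# Hypothetical sterane percentages, not taken from Table 4
example = dominant_sterane_environment(45.0, 35.0, 20.0)
```

A sample with percentages such as those in the example would be read as marine-influenced, in line with the C27 > C28 > C29 pattern described for the marine-sourced groups.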

Lacustrine and Fluvio-Lacustrine Group
The crude oils from this group are characterized by the abundance of C28 steranes (C28 > C29 > C27 steranes), which is typical of lacustrine-sourced oils (McKirdy et al., 1984). The deltaic-sourced oils tend to have a slight dominance of C29 over C28 and C27 steranes (Noble et al., 1991). Huang and Meinschein (1979) suggest that C28 sterane is derived from lacustrine algae. Only one group (Group 5) is categorised as a lacustrine/fluvio-lacustrine group. Group 5 consists of five oils from NW Java Basin and one oil from NE Java Basin (Sekarkorong oil). Oils from NWJ-3 and NWJ-5 also contain Botryococcane, which is observed on the chromatogram (Figure 7; Napitupulu et al., 1997). Botryococcane is a saturated, irregular isoprenoid biomarker produced by the lacustrine, colonial Chlorophycean alga Botryococcus braunii, an organism that thrives only in fresh/brackish water lacustrine environments (Peters et al., 2004). Botryococcane is also an age-related biomarker, indicating oil of Cenozoic age (Peters et al., 2004). In the NE Java Basin, Botryococcane is not detected.
The gas chromatogram of NWJ oil from NW Java Basin (Figure 7) shows a high concentration of nC21+ and peaks at the nC29, nC31, and nC33 alkanes. Group 5 also has relatively low sulphur and low oleanane/hopane, while the Pr/Ph ratios vary from low to high, indicating input from terrestrial organic matter in fluvio-lacustrine settings (Table 4).

Deltaic-sourced Groups
Groups 1, 2, and 6 are crude oils from deltaic source rocks that typically have a slight dominance of C29 over C28 and C27 steranes (Noble et al., 1991). Groups 1 and 2 consist mostly of oils from NE Java Basin, while Group 6 consists of oils from NW Java Basin. The considerable dissimilarities between the three groups lie in the amount of C29 sterane and the Pr/Ph ratios. Group 1 is relatively high in C29 sterane, Group 2 has less C29 sterane, and Group 6 has the highest value of Pr/Ph. The sterane plot (Figure 6) shows that Group 1 has more terrestrial organic matter input from higher plants compared to Groups 2 and 6, while Group 6 has more marine influence than Groups 1 and 2 (Figure 2), as indicated by high C27 sterane (Figure 8). One oil from the eastern part of NW Java Basin (oil seepage from Semarang) is categorised as Group 6 (Figure 6). From Group 6, the gas chromatogram of KE-9 shows a high peak of pristane compared to phytane (Figure 9; Satyana and Purwaningsih, 2003). Compared to the chromatogram in the lacustrine and fluvio-lacustrine group, the deltaic group has shorter-chain n-alkanes, from nC9+ with peaks at nC14 to nC16.
Figure 8. Partial m/z 217 fragmentogram from crude oil retrieved from NWJ-1 (Napitupulu et al., 1997).
[...] isoprenoids that contain Botryococcane from NWJ oils (Napitupulu et al., 1997). Botryococcane is a biomarker derived from Botryococcus braunii, colonial Chlorophycean algae that only live in the lacustrine environment (Peters et al., 2004).

Marine-sourced Groups
Groups 3 and 4 are crude oils from marine source rocks, distinguished from the lacustrine and deltaic groups by their high proportion of C27 sterane (Tables 4 and 5, Figure 5) and low molecular weight n-alkane distribution (Figure 11). Groups 3 and 4 are similar in terms of the abundance of C27 sterane and typically have the following sterane distribution: C27 > C28 > C29 steranes. Group 4 differs from Group 3 in its lack of C29 sterane (Table 4), indicating oils from marine source rocks.
Groups 3 and 4 are dominated by crude oils from NW Java Basin, although one oil from NE Java Basin, Suci-B, is categorised into Group 4 as it differs from the other oils from NE Java Basin. Suci-B has the lowest proportion of C29 sterane (6.70%), indicating a low influence of organic matter from terrestrial higher plants (based on Huang and Meinschein, 1979), and is high in C27 sterane, indicating an oil from a marine source rock. Napitupulu et al. (1997) suggest that two oils, NWJ-1 and NWJ-2, are derived from marine source rocks based on their low molecular weight n-alkanes (nC11 to nC17, Figure 11), low diasterane/sterane ratio, hopane/sterane > 2, C23 tricyclic terpane higher than C24 tetracyclic terpane, C35/C34 homohopane ratio > 1, a high homohopane index, C29/C30 hopane > 1, and an intermediate to high sulphur content. This study supports the interpretation that NWJ-2 is from a marine source rock, yet suggests that NWJ-1 is from a deltaic source rock based on its sterane distribution (Figure 6), Pr/Ph ratio, and HCA (Figure 5).

DEPOSITIONAL ENVIRONMENT BASED ON OIL GROUPING
During early rifting, the early synrift fluvio-deltaic rocks (Figure 12a) were deposited in terrestrial depositional environments. Several rock facies were deposited, including alluvial fan, braided fluvial channel, and lacustrine shale facies. The lacustrine shales are the main source rocks in this setting. This type of source rock is generally Type I and oil-prone, with a high hydrogen index. In this study, Group 5 oils are derived from these lacustrine source rocks.
During late rifting, the depositional environment ranged from deltaic to marine settings, where the deltaic source rocks (Groups 1, 2, and 6) and marine source rocks (Groups 3 and 4) were deposited (Figure 12b). Groups 1, 2, and 6 are generally derived from terrestrial organic matter, where Group 6 is potentially high in carbonaceous and coaly material. Groups 3 and 4 are oils from marine source rocks that are potentially siliciclastic. During early postrift, deltaic and marine rocks are also potential source rocks (Figure 12c), especially in the NE Java Basin; however, the late postrift rocks generally have no source rock potential (Figure 12d).

CONCLUSIONS
This study reveals different groups of potential source rocks in the NW Java and NE Java basins based on various geochemical parameters including Pr/Ph and δ13C of the saturate fraction (Figure 3). The Banuwati Formation differs from the Talangakar Formation in its high δ13C of saturates. Oil extracts from the Tuban Formation are relatively similar to the Talangakar Formation based on their Pr/Ph and δ13C of saturates. The Ngimbang and Kujung formations generally have Pr/Ph less than 4 and δ13C of saturates less than -26‰ (Figure 3).
Agglomerative hierarchical cluster analysis using the Euclidean distance is used in this study to classify various NW Java and NE Java oils, based on several geochemical parameters such as sulfur content, Pr/Ph, C27-C28-C29 steranes, and oleanane/hopane. Six oil groups are determined from this analysis. Groups 1, 2, and 6 are oils from deltaic source rocks, with a significant proportion of C29 sterane. These groups differ from each other in their C29 sterane concentration, where Group 1 has the highest C29 steranes, followed by Group 2 and Group 6. Group 5 is oil from lacustrine and/or fluvio-lacustrine source rock, where C28 steranes are more dominant than C27 and C29 steranes. Some crude oils from NW Java in this group (NWJ-3 and NWJ-5) also contain Botryococcane (Napitupulu et al., 1997), an isoprenoid biomarker formed from precursors in Botryococcus braunii, which only lives in fresh/brackish water lacustrine environments (Peters et al., 2004). Finally, Groups 3 and 4 are oils from marine source rocks; they are dominated by oils from NW Java, with only one oil from NE Java (Suci-B), which has very low C29 steranes and high C27 steranes.
The schematic depositional environment of source rocks is also described in this study (Figure 10). The Group 5 source rocks were deposited in a non-marine lacustrine early synrift environment, while those of the remaining groups were deposited in deltaic (Groups 1, 2, and 6) and marine settings (Groups 3 and 4). The main differences between those groups include the distributions of C27-C28-C29 steranes.