Introduction

In urban areas with separate storm sewer systems, domestic wastewater is designed to be kept separate from storm drainage, preventing its direct discharge into urban water bodies during rainfall. However, owing to improper management, such as connection errors, inadequate maintenance, or pipe deterioration, stormwater and/or groundwater can infiltrate domestic sewer systems or, conversely, domestic wastewater can leak into urban water areas, including subsurface water systems (US EPA 1996; Wolf et al. 2012; Ozaki et al. 2019). In the worst cases, pipes can collapse and cause road cave-ins, which can lead to severe traffic accidents or even human injuries. Aside from these extreme situations, the leakage of domestic wastewater into water bodies can cause water pollution and pose public health risks due to the presence of pathogens. In many urban areas, sewer pipes have been in use for a long time since their initial development. Moreover, even in the case of separate storm sewer systems, more than 50 years have passed since their construction in developed countries such as Japan. Therefore, the deterioration and renewal of such infrastructure have become important issues for these countries today.

Leaks from sewer pipelines can pollute urban subsurface and surface water through storm discharges. Storm pipelines, often older than sewer pipelines, act as key conduits for pollutant transfer. Diagnosing sewage leakage is challenging because the leaks are concealed below ground. Currently, only point or line (one-dimensional) probing methods are available to assess a leak distribution that is at least two-dimensional beneath the subsurface layers. Several diagnostic methods have been developed to address this issue, including direct visual observation using cameras, spot leakage measurements, sewer condition monitoring, tracer spiking tests, and others (Panasiuk et al. 2015; DeSilva et al. 2005; Rutsch 2008). However, estimates of leak ratios have sometimes varied by orders of magnitude (Ellis 2009; Wolf 2006), possibly reflecting spatial and temporal variations in the subsurface layers. Studies confirming such influences are limited, and the methodologies for investigating them are not well developed (Ellis et al. 2013; Spahr et al. 2020). Therefore, additional research is necessary to gain a better understanding of these spatial and temporal variations and to develop more advanced methodologies for accurately assessing sewer pipeline leaks.

In this study, the levels of pollution by domestic chemicals were measured in flowing or standing waters from stormwater drainage pipelines to investigate possible leakage from sewer pipelines into stormwater drainage channels in urban areas. Rainwater, puddle water, and domestic sewage were also collected for reference. First, stormwater drains were surveyed in six urban catchments with completed domestic drainage networks of various ages, ranging from 10 to more than 40 years old, to determine how age affects pollution. Next, in the area where the highest pollution was detected, standing waters from different stormwater ditches were sampled and their pollution levels were measured. From the results obtained, the possible contributions of, and relationships between, loading sources and domestic sewer pipelines are discussed. The chemical species measured included five fragrance compounds commonly used in soaps, detergents, and cosmetics: OTNE, HHCB, AHTN, DPMI, and musk ketone. Caffeine, which is found in coffee and various other beverages, and benzophenone, which is widely used in sunscreens, were also covered. These compounds were consistently detected in domestic wastewater. In addition, three polycyclic aromatic hydrocarbons (phenanthrene, fluoranthene, and pyrene), commonly used as indicators of air and urban surface pollution, were measured. Based on the data collected, the potential impacts of the domestic sewage pipelines are discussed.

Experimental

Description of the storm drainage areas

Water in storm drains was collected during dry periods at the downstream end points of six urban drainage areas (referred to as Districts A to F; Table 1 and Fig. S1). The watershed area for each point was clearly defined from maps of the storm drain networks with slope information. For Districts A, B, and C, stagnant water was collected from the gutters at the points. For Districts D, E, and F, a large box culvert was installed in each district to discharge surface rainwater, and water always flowed in it, albeit in small amounts, even during periods of no rainfall; these flowing waters were collected. The culverts are operated to prevent internal flooding, and those in D and E are equipped with pumps to discharge rainwater to the coastal area. In the following, the water sampled from the gutters or the box culverts is referred to as the "baseflow water" of the storm drain. Districts A, B, C, and F are located in Higashi-Hiroshima City, 20 km east of Hiroshima City in southwestern Japan, and Districts D and E are located in the downtown area of Hiroshima City. Districts A, B, and C are considerably small (1.2–6.4 ha), whereas D, E, and F are larger (103–478 ha). All districts have separate storm sewer systems, and the sewer coverage rates are almost 100% (100% for A to D, and 97% and 99% for E and F, respectively). Districts A, B, and C were developed as single-family residential areas through collaboration between the city bureaus and private housing developers, and domestic wastewater was designed to be removed by the sewer systems from the first stage of development. Sewer pipelines were developed in 2010 for A, 1993–94 for B, and 1990 for C. District D has a sewerage coverage rate of 100%, and its sewer pipeline development dates back to 1974. District E has a coverage rate higher than 97%, and its sewer pipeline development dates back to 1963.
District F is an old, historical downtown area of the city, and its domestic sewer pipelines were first developed in 1985. Although the pipeline network has been completed, about ten households have not yet been connected to the sewer pipes, and their greywater is discharged directly to the storm sewers. Unlike Districts A, B, and C, Districts D, E, and F were historically developed as downtown or residential areas predating the implementation of domestic sewers, and they comprise a mix of commercial and residential areas.

Table 1 Characteristics of stormwater catchment districts

For District F, some of the sampling and analysis results have been published previously (Ozaki et al. 2023). That study found and reported two erroneous or irregular situations concerning the domestic sewer network. One is that several households are not connected to the domestic sewers and discharge greywater into the storm drain network; each has only an individual septic treatment unit for toilet wastewater, and the untreated greywater is discharged directly into the storm drain. The other is an unexpected discharge to a storm drain, which has since been speculated to also be domestic greywater. The baseflow water discharged from this district stably contained the fragrance substances and was used as a positive control in this study. Domestic sewage, rainwater, and urban puddle water were also collected in that study, and their results are shown in this report for reference.

Sampling

Sampling was conducted from June 2022 to May 2023; water was collected once or twice a month, for a total of five to eleven times at each point. Samples were taken on dry days between 11:00 and 18:00, at least one day after the last rainfall had ended. Each sampling was conducted in a separate interval between rainfalls. Aside from the storm drain samples, rainwater, puddle water, and domestic sewage were collected; the details were reported in our previous study, and the results are shown here for reference (Ozaki et al. 2023). Briefly, rainwater was collected on the Hiroshima University campus with a stainless steel tray and a bottle. These were set out as soon as rainfall was noticed; consequently, the collected rainwater does not necessarily contain the first flush. Puddle water was collected at a roadside point in District F. Domestic sewage was collected nine times directly from a sewer pipe at a point in District F.

Analytical procedure

For analysis, about 500 mL of each water sample was collected (100 mL for domestic wastewater), and suspended solids were first removed by filtration through a glass fiber filter. Hence, all the targeted substances were measured in their free, dissolved forms. The filtered water was then subjected to solid-phase extraction to trap the targeted micropollutants. These processes were performed on the day of sampling, and the extraction was started within six hours of sampling.

The filtered sample water was percolated through a silica column cartridge to trap the targeted compounds. The column was subsequently eluted with 10 mL of dichloromethane, and the eluate was concentrated to 100 µL under a gentle stream of nitrogen gas. The concentrations of the compounds were quantified using a gas chromatograph equipped with a mass spectrometer operated in the single-ion monitoring mode. The targeted compounds were classified as domestic use chemicals or polycyclic aromatic hydrocarbons (PAHs). Among domestic use chemicals, five fragrance compounds were targeted, namely OTNE (1-[1,2,3,4,5,6,7,8-octahydro-2,3,8,8-tetramethylnaphthalen-2-yl]ethan-1-one), HHCB (1,3,4,6,7,8-hexahydro-4,6,6,7,8,8-hexamethylcyclopenta[g]-2-benzopyran), AHTN (1-[5,6,7,8-tetrahydro-3,5,5,6,8,8-hexamethyl-2-naphthalenyl]-ethanone), DPMI (6,7-dihydro-1,1,2,3,3-pentamethyl-4(5H)-indanone), and musk ketone; caffeine and benzophenone were also targeted. Among PAHs, three species were targeted, namely phenanthrene, fluoranthene, and pyrene. The detailed method is provided in Text S1, and the chemical characteristics of the targeted compounds are provided in Table S1.

Results and discussion

Comparison of different districts

The measured data are summarized in Table S2. Across all samples, the detection frequency was higher than 70% for the three PAHs, OTNE, HHCB, AHTN, and benzophenone, followed by caffeine at 64%, whereas DPMI and musk ketone were detected in less than half of the samples. Detection was defined as a concentration greater than 0.2 ng L−1 for each PAH and greater than 2 ng L−1 for the other compounds. Some representative detection records by mass spectrometry are shown in Fig. S2. The detection levels found in the baseflow waters of Districts A to F were 0.6–4.7 ng L−1 for each of the three PAHs, 3–9 ng L−1 for OTNE, HHCB, and AHTN, 28 ng L−1 for caffeine, and 10 ng L−1 for benzophenone (median values). OTNE, HHCB, and AHTN are hereafter collectively referred to as the "three musks", although OTNE is not necessarily classified as a musk.
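The detection criterion above can be expressed as a short calculation. The following is a minimal sketch, not the authors' actual procedure: it assumes each sample is stored as a concentration in ng L−1 (or `None` when no peak was found), and the function and constant names are hypothetical.

```python
from typing import Optional, Sequence

# Detection thresholds stated in the text (ng/L).
PAH_THRESHOLD = 0.2     # each PAH
OTHER_THRESHOLD = 2.0   # all other target compounds

# The three PAHs targeted in the study.
PAHS = {"phenanthrene", "fluoranthene", "pyrene"}


def detection_frequency(compound: str,
                        concentrations: Sequence[Optional[float]]) -> float:
    """Return the percentage of samples above the detection threshold.

    `concentrations` holds one value per sample in ng/L; None marks a
    sample with no measurable signal (an assumption of this sketch).
    """
    if not concentrations:
        raise ValueError("at least one sample is required")
    threshold = PAH_THRESHOLD if compound.lower() in PAHS else OTHER_THRESHOLD
    detected = sum(1 for c in concentrations if c is not None and c > threshold)
    return 100.0 * detected / len(concentrations)
```

For example, `detection_frequency("caffeine", [28.0, 1.0, None, 5.0])` counts two of the four samples as detected.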

For the domestic sewage, detection frequencies were almost 100% for all compounds. The concentrations ranged from single digits to hundreds of ng L−1 for the three PAHs, and from tens to ten thousand ng L−1 for the fragrance substances, caffeine, and benzophenone. Fragrance substances have been widely measured in domestic sewers worldwide over the last twenty years; for instance, Montes-Grajales et al. (2017) summarized the measured data in their review paper. According to their review, HHCB and AHTN were among the most frequently measured substances, with measured concentrations of several hundred up to ten thousand ng L−1, followed by DPMI at ten to a thousand ng L−1. These were comparable to our results. On the other hand, the musk ketone concentrations in the review were considerably lower, in the range of tens to several hundred ng L−1, compared with our results of several hundred to tens of thousands of ng L−1. Yet the observed differences were not large, considering the two to three orders of magnitude of variation typically seen for micropollutants in water areas. One possible explanation is that usage trends of this ingredient in personal care products diverge among countries. While musk ketone itself is not prohibited, many other nitro musk compounds have been banned in many countries, including Japan (The International Fragrance Association 2023; European Commission 2011; Maekawa et al. 1990). Additionally, it is possible that the fragrance industry has voluntarily refrained from using musk ketone, and the extent of this avoidance may vary across companies and countries. In Japan, there are currently no voluntary industry efforts to regulate musk ketone. For benzophenone, Ekpeghere et al. (2016) measured influent concentrations at nine sewage treatment plants that received only domestic wastewater. The reported concentrations ranged from not detected to several thousand ng L−1, with median values exceeding one thousand ng L−1.
The concentrations observed by Ekpeghere et al. (2016) were considerably higher than ours, possibly reflecting differences in the lifestyles of residents and/or the industrial inputs received by the wastewater treatment plants they studied. In summary, our study was generally consistent with previous research, with some minor exceptions, in terms of detection frequencies and concentration levels in domestic wastewater, particularly for fragrance substances and benzophenone.

As a baseline loading reference, rainwater and, additionally, puddle water concentrations were measured. The detection frequency for the PAHs was 80% at the lowest, while for each of the three musks it ranged from 60% to 82%. The detection frequency for benzophenone was also high, i.e., 100% for rainwater and 90% for puddle water. On the other hand, caffeine and DPMI were detected less often, in less than half of the samples at most. These detected loadings may primarily relate to atmospheric pollution. Concerning PAHs, extensive reports have been published on atmospheric pollution and atmospheric fallout. For example, Grynkiewicz et al. (2002) and Manoli et al. (2002) reported on bulk precipitation samples in urban areas and compared them with other reports, mainly from European countries and the US, showing similar levels ranging from several to several tens of ng L−1 for low or moderate molecular weight components such as phenanthrene, fluoranthene, and pyrene. In contrast, information on fragrance substances and other household chemicals is limited. In our previous study, we reported rainwater concentrations of fragrance substances, and to the authors' best knowledge, one report showed concentrations similar to ours for HHCB and AHTN measured in the Netherlands (Peters et al. 2008). Currently, the atmospheric pollution pathway of these substances is not as clear as that of PAHs, which are known to be emitted directly into the air from sources such as vehicles, heating, and industrial processes. However, being fragrance substances, they have a higher tendency to volatilize into the air phase, as indicated by factors such as Henry's constant, than PAHs of similar molecular weight. Therefore, these high concentrations in rainwater are likely related to their volatile nature.

Regarding the lower detection of DPMI and musk ketone, the reasons could be diverse. While the Henry's constant of DPMI is comparable to those of OTNE, HHCB, and AHTN, that of musk ketone is substantially lower, by two orders of magnitude. This would suppress the atmospheric pathway. Additionally, since the musk ketone and DPMI concentrations in domestic sewage were lower than those of the three musks by more than an order of magnitude, their atmospheric concentrations would be correspondingly lower.

Relative concentration ratio to domestic sewage

To compare the baseflow waters of the different districts, the ratio of the median concentration at each point and stage to that in domestic sewage was calculated (Fig. 1; musk ketone was omitted because of its low detection frequency; the whole concentration distributions are shown in Fig. S3). The trends fell into three classes. The first comprised the three PAHs and benzophenone, for which the ratio was comparatively flat from domestic sewage to rainwater. The second comprised the three musks, for which the ratio varied over three orders of magnitude, with the background level found in rainwater being two to three orders of magnitude lower than that in domestic sewage. The third was caffeine, for which the background ratio for rainwater could not be calculated because the concentration was below the detection limit; in other words, the ratio was below 10−3, more than three orders of magnitude lower than domestic sewage.
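The ratio described above can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, concentrations are assumed to be in ng L−1, and the treatment of non-detects (returning `None` when the site median is not above the detection limit) is an assumption rather than the authors' stated method.

```python
import math
from statistics import median
from typing import Optional, Sequence


def relative_ratio(site_values: Sequence[float],
                   sewage_values: Sequence[float],
                   detection_limit: float = 0.0) -> Optional[float]:
    """Median site concentration divided by median domestic sewage concentration.

    Returns None when the site median is not above `detection_limit`,
    because the ratio cannot then be calculated (assumed handling).
    """
    site_median = median(site_values)
    sewage_median = median(sewage_values)
    if sewage_median <= 0:
        raise ValueError("sewage median must be positive")
    if site_median <= detection_limit:
        return None
    return site_median / sewage_median


def orders_of_magnitude(ratio: float) -> float:
    """Express a ratio as a base-10 logarithm; -2 means one hundredth."""
    return math.log10(ratio)
```

With hypothetical values, `relative_ratio([10, 30, 20], [1000, 3000, 2000])` gives a ratio of one hundredth, i.e., two orders of magnitude below domestic sewage.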

Fig. 1

Relative concentration ratios to domestic sewage of phenanthrene, fluoranthene, pyrene, musks, benzophenone, and caffeine, based on median values, for the waters of Districts A to F, rainwater, and puddle water. The ratio was most often highest for domestic sewage. On the other hand, the decrease relative to domestic sewage differed considerably among groups of chemicals. While the decrease was small for polycyclic aromatic hydrocarbons (PAHs) and benzophenone, it was considerably larger for the musks and caffeine.

For the first class, the level of pollution can be regarded as similar across all waters, even including domestic sewage. For such compounds, the contributions of rainwater and domestic sewage to the pollution of baseflow water cannot be clearly distinguished. Domestic sewage did show the highest level for all species, yet the difference was only about an order of magnitude. Interestingly, for the PAHs, the contamination level of the baseflow waters was lower even than that of rainwater, a tendency also seen in our previous study. This was possibly due to adsorption to soil layers during infiltration through surface or subsurface areas. For the three musks, the level of contamination was two or more orders of magnitude lower than that of domestic sewage. For OTNE and HHCB, the level was slightly higher in Districts D to F, but not by much. For the third class, caffeine, the level of contamination was distinctly different among rainwater, storm drainage, and domestic sewage. The caffeine level in rainwater, if present at all, was below the ng L−1 level in most cases. The Henry's constant of caffeine is extremely low, on the order of 10−6 Pa m3 mol−1, which is on the order of one millionth of those of OTNE, HHCB, and AHTN, or one hundredth of those of the PAHs (Wong et al. 2019). Therefore, even if the rainfall fallout of OTNE, HHCB, and AHTN represents recirculation of material once volatilized from water bodies, the same effect would be very limited for caffeine; that is, the rainfall input would be negligible, less than one thousandth at most. In these cases, it would not be unreasonable to attribute the source of such contamination to domestic sewage. The relative concentrations were negligible for A (less than 10−3), on the order of less than one hundredth for B, C, and F, and more than one hundredth for D and E, possibly reflecting the different influence of domestic sewage.
In summary, distinct differences in relative concentrations were observed among the substance classes, possibly reflecting differences in the origin of the pollution.

Relation of characteristics of different watershed areas

Measurements were made in six different districts. In terms of sewer network development, District A is the most recently developed, followed by B and C, and then D to F. Another difference lies in the development history. Districts A, B, and C were first developed as single-family residential areas by municipalities together with private developers. On the other hand, Districts D, E, and F were historically established, the oldest in pre-modern times, and their sewerage networks were built into the existing urban areas over a long period, from the 1970s for the oldest up to the present. District F in particular is the historic old town of the city, dating back to medieval times, and older drainage infrastructure had already been laid down over that long history. Moreover, although F is in the very center of Higashi-Hiroshima City, now a well-developed, medium-sized city, its history as a city is not very long (1974 to present), and neither is its history of well-planned development by the city.

The authors previously found in District F a possible misconnection involving a vestige of what may be an old domestic sewer pipeline, which had not been accurately recognized by the municipality. In addition, while the public domestic sewage network in this district has been completed, the connection of households has not. About ten households were not connected to the sewerage system and were directly discharging greywater. Overall, as in our previous research, significant domestic wastewater discharges were suspected in this neighborhood, which may be a positive case of sewage entering the stormwater drainage.

As mentioned above, District A is considered the "cleanest" in terms of sewer leaks, and F serves as a positive control with distinct domestic sewage emissions. Viewed in this light, the pollution level for the three musks and caffeine was lowest in A. Conversely, a stably higher pollution level was observed in F than in A, indicating that these chemical species can be expected to be good markers of domestic sewage emissions. Interestingly, in District A the concentration of the three musks was the lowest and was even lower than that in rainwater, which was not seen in the other planned residential districts (B and C). Possible reasons for this are adsorption to soil or, alternatively, volatilization. However, no consistent explanation could be found for why PAH concentrations were lower than those in rainwater across all of Districts A to F while the three musks were not lower in B and C. Further research is needed on this point.

Among Districts B to E, the levels of the three musks and caffeine were lower in B and C and higher in D and E. The higher contamination in D and E may be attributed to the sewer networks having been built after the towns were developed, suggesting that the chance of leakage varies not only with the time elapsed since development but also with the situation at the time of construction. Still, the fact that the level was higher in B and C than in A suggests that the age since construction is also significant.

Comparing the three musks with caffeine, the differences among districts were clearer for caffeine. One possible reason is the different influence of rainwater as a background. Another is the partitioning characteristics among soil, water, and air. As mentioned above, both the Henry's constant and Kow are lowest for caffeine, which means that caffeine is the most stable in the water phase among the species. If other effects such as physicochemical or biological degradation are negligible, caffeine would possibly be the tracer that most closely follows the movement of the water itself.

Turning to District F, its level was comparable to those of D and E for the three musks but substantially lower for caffeine, similar to those of B and C. The reason was not clarified in this survey. If it is explained by a difference in source, the presumed source in F was limited, because the only explicit source was a few households. The source pattern from a limited number of households can differ from the ordinary pattern represented by the domestic sewage results. Another possible difference is that this emission consists only of greywater and does not include black water. Although general information on caffeine concentrations in black water is not available, the caffeine concentration in urine is known to reach the order of several hundred µg L−1 (Rybak et al. 2015). This implies that black water is a major source of caffeine in domestic sewage, and the lack of black water could explain the comparatively low level in F. Further consideration is needed as a next step.

Concentration pattern for the points in the polluted district D

The concentration pattern was investigated for stormwater ditches at several points (D1, D2, D3, and D4; Fig. S4) in District D, where the concentration was found to be stably high despite no known emissions from domestic sewers. Points D1 to D3 are along the old main street of the district, where the sewer pipelines were developed in 1981–82. D4 is on another main street, where the sewer pipelines were developed in 1979. At each point, samples were collected nine or ten times (the whole distributions are shown in Fig. S5). Overall, the levels were similar to each other and to the final storm discharge (Fig. 2a). The level was slightly lower at point D4, but not substantially. For caffeine, as for the other substances, the mean level did not differ distinctly among the points, but the variation differed substantially from point to point, which was not observed for the other substances. To compare the variation, the geometric standard deviation was calculated for each species and point (Fig. 2b). The variation was two- to four-fold, with the exception of caffeine, which showed up to a 16-fold difference. If this high variation for caffeine is due to leakage, the leakage would be a strongly time-dependent phenomenon. Among the four points, the level was lowest at D4, but it was still higher than in the other, relatively unpolluted districts. To compare the pollution levels comprehensively, cumulative probability plots of caffeine for each point were compared with those for the clean areas (Districts A, B, and C) (Fig. 3). The concentration was substantially higher even at point D4 than in the clean Districts A, B, and C. This implies that the emission occurs more or less ubiquitously in the district.
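The geometric mean and geometric standard deviation used for this comparison can be computed from log-transformed concentrations. The sketch below is illustrative only: the function names are hypothetical, and the use of the sample (n − 1) standard deviation of the logarithms is an assumption, since the text does not state which form was used. It requires strictly positive values, so non-detects would need separate handling.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def _logs(values: Sequence[float]) -> List[float]:
    """Natural logarithms of the values; all values must be positive."""
    if any(v <= 0 for v in values):
        raise ValueError("concentrations must be positive")
    return [math.log(v) for v in values]


def geometric_mean(values: Sequence[float]) -> float:
    """exp of the arithmetic mean of the log-concentrations."""
    return math.exp(mean(_logs(values)))


def geometric_sd(values: Sequence[float]) -> float:
    """exp of the sample standard deviation of the log-concentrations.

    The result is a dimensionless multiplicative factor: a value of 2
    means a typical two-fold spread around the geometric mean.
    """
    return math.exp(stdev(_logs(values)))
```

Because the geometric standard deviation is a fold factor rather than a concentration, it allows the variability of compounds with very different concentration levels to be compared on the same scale.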

Fig. 2

Geometric mean (a) and geometric standard deviation (b) for the different points in the polluted District D (D1 to D4) and the pump station in District D. Note that the geometric standard deviation for caffeine was high, more than eight-fold, at points D1 and D2.

Fig. 3

Comparison of the different points in the polluted District D (D1 to D4) with the clean discharges (Districts A to C) by cumulative probability plots of caffeine concentration. Although the lowest levels, on the order of 10 ng L−1, were similar between all the points and the clean discharges, the higher levels differed substantially among the points.
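A cumulative probability plot of this kind pairs each sorted concentration with an empirical cumulative probability. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the plotting position i/(n + 1) is one common convention chosen here for illustration; the text does not specify which position formula was used.

```python
from typing import List, Sequence, Tuple


def cumulative_probability(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair each sorted value with the plotting position i / (n + 1).

    i is the 1-based rank of the value and n the number of samples
    (an assumed convention, not necessarily the one used in the study).
    """
    if not values:
        raise ValueError("at least one sample is required")
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / (n + 1)) for i, v in enumerate(ordered)]
```

Plotting the resulting pairs for each point, with concentration on a logarithmic axis, allows whole distributions to be compared rather than only their central values.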

The results of this study are very limited, and a comprehensive investigation is needed to determine the true hidden sources. Yet, even from this limited survey, several suggestions can be made. First, even at relatively low-level detection points in the district (e.g., D4), both musks and caffeine were higher than in other, less polluted areas. This suggests that, even if the source of these substances is domestic wastewater, they are not emitted only from a limited number of hotspots in this area. This is a significant difference from District F, where our previous study also found high discharge due to old sewage pipes. In District F, the outflow came from a limited number of hotspots, i.e., from misconnections or households not connected to the sewer network, whereas in District D no such occurrence was found, suggesting that the outflow was spread comparatively evenly in space.

On the other hand, for caffeine, we found a high-concentration spot. This possible hotspot was not always high, suggesting intense temporal variation. Another important feature is that the musk compounds did not show a similar trend in this respect. The reasons for the difference could be different sources or differences in the physicochemical characteristics of the substances. Unfortunately, the authors do not have sufficient insight into possible source differences. Earlier, we indicated that the low caffeine level in District F could be because the source of the leak is not toilet discharge. If this is the case, it would mean that toilet discharge is a specific source at this site, but no positive evidence was found for this. On the other hand, we consider that differences in solid–liquid partitioning may be another important factor among the physicochemical differences.

The difference between caffeine and musk is interesting in terms of source and kinetics, and clarifying the cause of this difference will allow for a deeper understanding of the dynamics of domestic wastewater. This will be an important subject of future studies.

Conclusion

In this study, the concentrations of domestic use chemicals were measured in the discharges and residual waters of storm sewer pipelines during periods of no rainfall in various urban drainage areas with separate storm sewer systems. The concentrations of domestic use chemicals tended to be higher in areas developed earlier, suggesting erroneous situations in the management of domestic sewer pipelines. Still, even among similarly old districts, the possible leakage concentrations differed among chemical species. For two old developed areas (Districts D and E), the caffeine concentration was higher than in the other areas; conversely, in another old area (District F), a lower concentration of caffeine but a higher concentration of musks was observed. These differences were considered to reflect the different histories of development and maintenance and, further, possibly the different physicochemical properties of the chemical species. Our data suggest that different leakage situations can be distinguished by comparing musk compounds and caffeine. As a next step, it is important to clarify the difference in component patterns between greywater and black water, and to analyze the physicochemical behavior and interactions of the substances in the subsurface soil, water, and air phases. From these analyses, the various leakage situations of sewer pipelines may be inferred from observations of urban catchment water areas.