OSPAR 2025 PFAS investigation
Here we investigate the various ways PFAS compounds have been submitted and identify the dominant compounds based on their relative potency factors (RPF). The investigation is based on extractions from the ICES database on 10 January 2025.
The way in which PFAS will be assessed in the 2025 CEMP Assessment is based on this investigation. However, it will be reviewed before the 2026 CEMP Assessment as some contracting parties have additional PFAS data that are not presently in the ICES database.
Here are all the ICES codes, descriptions and CAS numbers for compounds in the O-FL (organofluorines) pargroup. Some of the longer code descriptions have been abbreviated to make everything fit.
Important: PFOSA has been deprecated and ‘replaced’ by PFOSD. This has caused (and will continue to cause) lots of confusion.
PFNOA has also been deprecated and replaced by PFNA.
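As a minimal sketch of how the two deprecated codes could be mapped to their replacements, the lookup below uses only the pairs named above. The helper `recode_deprecated()` is hypothetical and is not part of the analysis code that follows.

```r
# Deprecated ICES codes and their replacements, as stated in the text
deprecated_codes <- c(PFOSA = "PFOSD", PFNOA = "PFNA")

# Hypothetical helper: replace deprecated codes and leave all others unchanged
recode_deprecated <- function(param, lookup = deprecated_codes) {
  is_deprecated <- param %in% names(lookup)
  param[is_deprecated] <- unname(lookup[param[is_deprecated]])
  param
}

recode_deprecated(c("PFOS", "PFOSA", "PFNOA", "PFNA"))
```

A named vector keeps the mapping in one place, so further deprecated codes could be added without touching the function.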
wk <- read.csv(
file.path("extra_resources", "param_CAS.tab"),
sep = "\t",
na.strings = ""
)
names(wk) <- names(wk) |>
tolower() |>
sub(pattern = ".", replacement = "_", fixed = TRUE) |>
sub(pattern = ".", replacement = "", fixed = TRUE)
wk <- dplyr::mutate(
wk,
cas = dplyr::if_else(cas %in% c("IS", "NV"), NA_character_, cas)
)
cas_data <- wk
wk <- read.csv(
file.path("extra_resources", "param_pargroup.csv"),
na.strings = ""
)
# columns misnamed
wk <- dplyr::rename(
wk,
pargroup = "PARAM",
param = "Pargroup",
pargroup_desc = "PARAM_desc",
param_desc = "Pargroup_desc"
)
wk <- wk |>
dplyr::filter(pargroup %in% "O-FL") |>
dplyr::select(param, param_desc)
wk <- dplyr::left_join(wk, cas_data[c("param", "cas")], by = "param")
wk <- dplyr::arrange(wk, toupper(param))
param_data <- wk
wk <- dplyr::mutate(
param_data,
param_desc = dplyr::if_else(
param == "DFNSA",
sub("Nonanesulfonic acid", "Nonaneslfnc acd", param_desc),
param_desc
),
param_desc = dplyr::if_else(
param == "DFNSA",
sub("nonadecafluoro", "nonadecaflr", param_desc),
param_desc
),
param_desc = dplyr::if_else(
param == "ADONA",
sub("(trifluoromethoxy)propoxy", "(triflrmthxy)prpxy", param_desc, fixed = TRUE),
param_desc
),
param_desc = sub(
"Sum of total oxidisable precursors of",
"Sum total oxidisable precursors",
param_desc
),
param_desc = dplyr::if_else(
param == "PFNOA",
paste(param_desc, "(deprecated)"),
param_desc
)
)
print(wk, row.names = FALSE)
param param_desc cas
ADONA 2,2,3-Trifluoro-3-(1,1,2,2,3,3-hexafluoro-3-(triflrmthxy)prpxy)propanoic acid <NA>
AMP3PFN Ammonium 4,8-dioxa-3H-perfluorononanoate <NA>
br-PFDS Perfluorodecanesulfonic acid - sum of all isomers with branched form <NA>
br-PFHXS Perfluorohexanesulfonic acid - sum of all isomers with branched form <NA>
br-PFOS Perfluorooctanyl sulphonic acid - sum of all isomers with branched form <NA>
br-PFOSD Perfluorooctanesulfonamide - sum of all isomers with branched form <NA>
CL82PFAES Potassium 11-chloroeicosafluoro-3-oxaundecane-1-sulfonate <NA>
CLPFESA Potassium 9-chlorohexadecafluoro-3-oxanonane-1-sulfonate <NA>
DFNSA 1-Nonaneslfnc acd, 1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,9-nonadecaflr-, ion(1-) <NA>
FHxSA Perfluorohexanesulfonamide <NA>
FTUCA 2H-Perfluoro-2-decenoic acid <NA>
HFPO-DA Hexafluoropropylene oxide dimer acid 13252-13-6
n-PFBSA Perfluorobutanesulfonic acid - sum of all isomers with linear form <NA>
n-PFDS Perfluorodecanesulfonic acid - sum of all isomers with linear form <NA>
n-PFHpS Perfluoroheptanesulfonic acid - sum of all isomers with linear form <NA>
n-PFHXS Perfluorohexanesulfonic acid - sum of all isomers with linear form <NA>
n-PFOS Perfluorooctanyl sulphonic acid - sum of all isomers with linear form <NA>
n-PFOSD Perfluorooctanesulfonamide - sum of all isomers with linear form <NA>
NETFOSAA N-Ethylperfluorooctanesulfonamidoacetate 2991-50-6
NFBSF 1,1,2,2,3,3,4,4,4-nonafluorobutane-1-sulphonyl fluoride <NA>
NFHxSA 3,3,4,4,5,5,6,6,6-Nonafluoro-1-hexanesulfonic acid 757124-72-4
NMEFOSAA 2-(N-Methylperfluorooctanesulfonamido)acetic acid <NA>
OPFECHS Perfluoro-1-ethylcyclohexane sulfonate (potassium salt) <NA>
PERFBS Perfluorobutanesulfonate <NA>
PFBA Perfluorobutyric acid 375-22-4
PFBS Potassium perfluorobutanesulfonate 29420-49-3
PFBSA Perfluorobutanesulfonic acid 375-73-5
PFDA Perfluorodecanoic acid 335-76-2
PFDoA Perfluorododecanoic acid <NA>
PFDS Perfluorodecanesulfonic acid - branched + linear form 335-77-3
PFDSA 1H,1H,2H,2H-Perfluorodecanesulfonic acid 39108-34-4
PFDSI Perfluorodecanesulfonate anion 126105-34-8
PFDSK Potassium Perfluorodecanesulfonate <NA>
PFECHS Perfluor-4-ethylcyclohexan sulfonate (potassium salt) <NA>
PFHPA Perfluoroheptanoic acid <NA>
PFHpS Perfluoroheptanesulfonic acid 375-92-8
PFHXA Perfluorohexanoic acid <NA>
PFHxDA Perfluorohexadecanoic acid 67905-19-5
PFHXS Perfluorohexanesulfonic acid - branched + linear form 355-46-4
PFHxSI Perfluorohexanesulfonate anion <NA>
PFHxSK Potassium Perfluorohexansulfonate <NA>
PFNA Perfluorononanoic acid 375-95-1
PFNOA Perfluorononanoic acid (deprecated) 375-95-1
PFNS Perfluorononanesulfonic acid 68259-12-1
PFOA Perfluorooctanoic acid 335-67-1
PFOcDA Perfluorooctadecanoic acid 16517-11-6
PFOS Perfluorooctanyl sulphonic acid - branched + linear forms 1763-23-1
PFOSA Perfluorooctylsulfonate acid amide (deprecated) <NA>
PFOSD Perfluorooctanesulfonamide - branched + linear forms 754-91-6
PFOSF18 Perfluorooctanesulfonyl fluoride 307-35-7
PFPeA Perfluoropentanoic acid 2706-90-3
PFPEDA Perfluoropentadecanoic acid 141074-63-7
PFPeS Perfluoropentanesulfonic acid 2706-91-4
PFTDA Perfluorotetradecanoic acid 376-06-7
PFTrDA Perfluorotridecanoic acid 72629-94-8
PFUnda Perfluoroundecanoic acid <NA>
TFA Trifluoroacetic acid 76-05-1
THPFOS 1H,1H,2H,2H-perfluorooctane sulfonate 27619-97-2
TOP-PFBA Sum total oxidisable precursors perfluorobutanoic acid (all isomers) <NA>
TOP-PFBSA Sum total oxidisable precursors perfluorobutanesulfonic acid (all isomers) <NA>
TOP-PFDA Sum total oxidisable precursors perfluorodecanoic acid (all isomers) <NA>
TOP-PFDoA Sum total oxidisable precursors perfluorododecanoic acid (all isomers) <NA>
TOP-PFDS Sum total oxidisable precursors perfluorodecanesulfonic acid (all isomers) <NA>
TOP-PFHPA Sum total oxidisable precursors perfluoroheptanoic acid (all isomers) <NA>
TOP-PFHXA Sum total oxidisable precursors perfluorohexanoic acid (all isomers) <NA>
TOP-PFHxDA Sum total oxidisable precursors perfluorohexadecanoic acid (all isomers) <NA>
TOP-PFHXS Sum total oxidisable precursors perfluorohexanesulfonic acid (all isomers) <NA>
TOP-PFNA Sum total oxidisable precursors perfluorononanoic acid (all isomers) <NA>
TOP-PFOA Sum total oxidisable precursors perfluorooctanoic acid (all isomers) <NA>
TOP-PFOcDA Sum total oxidisable precursors perfluorooctadecanoic acid (all isomers) <NA>
TOP-PFOS Sum total oxidisable precursors perfluorooctanesulfonic acid (all isomers) <NA>
TOP-PFPeA Sum total oxidisable precursors perfluoropentanoic acid (all isomers) <NA>
TOP-PFTeDA Sum total oxidisable precursors perfluorotetradecanoic acid (all isomers) <NA>
TOP-PFTrDA Sum total oxidisable precursors perfluorotridecanoic acid (all isomers) <NA>
TOP-PFUnda Sum total oxidisable precursors perfluoroundecanoic acid (all isomers) <NA>
And here, perhaps, are the codes relevant to MIME submissions and assessment. The columns are:
The first 24 rows are for the 24 PFAS compounds (referred to as the PFAS 24) on which the proposed PFAS EQS will be based. The last row is for the precursor PFOSA (standard usage) or PFOSD (ICES code).
Note the likely confusion with acronyms PFNA (deprecated code PFNOA), PFBS (ICES code PFBSA) and PFOSA (ICES code PFOSD, deprecated code PFOSA).
Also, three of the PFAS 24 do not have ICES codes.
MIME_codes <- read.csv(
file.path("extra_resources", "pfas_compounds v2.csv"),
na.strings = ""
)
names(MIME_codes) <- tolower(names(MIME_codes))
# checks
stopifnot(
na.omit(MIME_codes$param) %in% param_data$param,
na.omit(MIME_codes$linear) %in% param_data$param,
na.omit(MIME_codes$branched) %in% param_data$param,
na.omit(MIME_codes$sum_precursor) %in% param_data$param
)
wk <- sapply(names(MIME_codes), function(x) "", simplify = FALSE)
wk$rpf <- NA_real_
tidyr::replace_na(MIME_codes, wk)
    acronym   cas           param    linear   branched  related        deprecated      rpf  sum_precursor
 1  PFBA      375-22-4      PFBA                                                    0.0500  TOP-PFBA
 2  PFPeA     2706-90-3     PFPeA                                                   0.0300  TOP-PFPeA
 3  PFHxA     307-24-4      PFHXA                                                   0.0100  TOP-PFHXA
 4  PFHpA     375-85-9      PFHPA                                                   0.5050  TOP-PFHPA
 5  PFOA      335-67-1      PFOA                                                    1.0000  TOP-PFOA
 6  PFNA      375-95-1      PFNA                                       PFNOA       10.0000  TOP-PFNA
 7  PFDA      335-76-2      PFDA                                                    7.0000  TOP-PFDA
 8  PFUn(D)A  2058-94-8     PFUnda                                                  4.0000  TOP-PFUnda
 9  PFDo(D)A  307-55-1      PFDoA                                                   3.0000  TOP-PFDoA
10  PFTrDA    72629-94-8    PFTrDA                                                  1.6500  TOP-PFTrDA
11  PFTeDA    376-06-7      PFTDA                                                   0.3000  TOP-PFTeDA
12  PFHxDA    67905-19-5    PFHxDA                                                  0.0200  TOP-PFHxDA
13  PFODA     16517-11-6    PFOcDA                                                  0.0200
14  PFBS      375-73-5      PFBSA    n-PFBSA                                        0.0010  TOP-PFBSA
15  PFPeS     2706-91-4     PFPeS                                                   0.3005
16  PFHxS     355-46-4      PFHXS    n-PFHXS  br-PFHXS  PFHxSI~PFHxSK               0.6000  TOP-PFHXS
17  PFHpS     375-92-8      PFHpS    n-PFHpS                                        1.3000
18  PFOS      1763-23-1     PFOS     n-PFOS   br-PFOS                               2.0000  TOP-PFOS
19  PFDS      335-77-3      PFDS     n-PFDS   br-PFDS                               2.0000  TOP-PFDS
20  6:2 FTOH  647-42-7                                                              0.0200
21  8:2 FTOH  678-39-7                                                              0.0400
22  HFPO-DA   62037-80-3    HFPO-DA                                                 0.0600
23  ADONA     958445-44-8   ADONA                                                   0.0300
24  C6O4      1190931-41-9                                                          0.0600
25  PFOSA     754-91-6      PFOSD                                      PFOSA            NA
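The compounds without ICES codes can be picked out by looking for missing values in the `param` column. The sketch below uses a small hand-typed subset of the table above (the data frame `codes` is illustrative, not the full `MIME_codes` object).

```r
# Illustrative subset of the table above: acronym, ICES code (param) and RPF
codes <- data.frame(
  acronym = c("PFOA", "6:2 FTOH", "8:2 FTOH", "HFPO-DA", "C6O4"),
  param = c("PFOA", NA, NA, "HFPO-DA", NA),
  rpf = c(1, 0.02, 0.04, 0.06, 0.06)
)

# Compounds with no ICES code have a missing param
no_ices_code <- codes$acronym[is.na(codes$param)]
no_ices_code
```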
Here is the number of PFAS measurements submitted by each contracting party.
country
determinand Denmark France Germany Ireland Norway Sweden The Netherlands United Kingdom
br-PFDS . . . . . 111 . .
br-PFHXS . . . . . 111 . 68
br-PFOS . . . . . 111 . 135
br-PFOSD . . . . . 111 . .
HFPO-DA 2 . . . . . . .
n-PFDS . . . . . 111 . .
n-PFHXS . . . . . 111 . .
n-PFOS . 20 75 . . 111 . 296
n-PFOSD . . . . . 111 . .
PFBA 2 . . . . . 348 68
PFBS 132 . 2 16 2131 44 . .
PFBSA 37 . . . . 111 345 68
PFDA 468 . 77 16 1465 111 373 296
PFDoA 376 . 2 4 128 155 358 296
PFDS 187 . 2 4 128 . 345 68
PFDSI 89 . . . . . . .
PFHPA 156 . 2 16 2637 155 373 68
PFHpS 126 . . . . . 345 .
PFHXA 136 . 2 16 2637 155 373 68
PFHxDA 19 . . . . . . .
PFHXS 366 . 2 16 426 44 345 68
PFNA 465 . 77 . 1440 111 373 296
PFNOA . . . 16 1196 . . .
PFOA 420 . 280 46 2610 158 373 296
PFOcDA 2 . . . . . . .
PFOS 433 103 311 46 2637 53 373 .
PFOSA . . 2 16 1468 . . .
PFOSD 342 . . . 852 . . 296
PFOSF18 . . . . . 1 . .
PFPeA 9 . . . . . 373 68
PFPEDA 2 . . . . 111 . .
PFTDA 181 . 75 . 128 111 344 68
PFTrDA 368 . 75 . 128 111 344 296
PFUnda 470 . 75 . 1112 111 373 220
Now let’s focus on the PFAS 24. Here are the relevant numbers of measurements by contracting party. PFBS has been mapped to PFBSA and PFNOA has been mapped to PFNA. The three compounds with no ICES codes are omitted.
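It turns out that some measurements submitted under the code for the total compound are for the linear isomer only: PFHXS from Sweden and the United Kingdom, PFBSA from Sweden, and PFOS from Norway.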
These have been corrected in the table below.
# check no data submitted as both PFBS and PFBSA (or PFNA and PFNOA)
check <- data |>
dplyr::filter(determinand %in% c("PFBS", "PFBSA")) |>
dplyr::select(sample, matrix) |>
anyDuplicated()
if (check != 0) stop()
check <- data |>
dplyr::filter(determinand %in% c("PFNA", "PFNOA")) |>
dplyr::select(sample, matrix) |> anyDuplicated()
if (check != 0) stop()
data <- dplyr::mutate(
data,
determinand = dplyr::case_match(
determinand,
"PFBS" ~ "PFBSA",
"PFNOA" ~ "PFNA",
.default = determinand
),
determinand = dplyr::if_else(
determinand == "PFHXS" & country == "United Kingdom",
"n-PFHXS",
determinand
),
determinand = dplyr::if_else(
determinand == "PFHXS" & country == "Sweden",
"n-PFHXS",
determinand
),
determinand = dplyr::if_else(
determinand == "PFBSA" & country == "Sweden",
"n-PFBSA",
determinand
),
determinand = dplyr::if_else(
determinand == "PFOS" & country == "Norway",
"n-PFOS",
determinand
)
)
new_codes <- na.omit(MIME_codes$param)
new_codes <- setdiff(new_codes, "PFOSD")
new_codes <- append(new_codes, "n-PFBSA", after = match("PFBSA", new_codes))
new_codes <- append(new_codes, c("n-PFHXS", "br-PFHXS"), after = match("PFHXS", new_codes))
new_codes <- append(new_codes, "n-PFHpS", after = match("PFHpS", new_codes))
new_codes <- append(new_codes, c("n-PFOS", "br-PFOS"), after = match("PFOS", new_codes))
new_codes <- append(new_codes, c("n-PFDS", "br-PFDS"), after = match("PFDS", new_codes))
data |>
dplyr::filter(determinand %in% new_codes) |>
dplyr::mutate(determinand = factor(determinand, new_codes)) |>
with(table(determinand, country)) |>
print(zero.print = ".")
country
determinand Denmark France Germany Ireland Norway Sweden The Netherlands United Kingdom
PFBA 2 . . . . . 348 68
PFPeA 9 . . . . . 373 68
PFHXA 136 . 2 16 2637 155 373 68
PFHPA 156 . 2 16 2637 155 373 68
PFOA 420 . 280 46 2610 158 373 296
PFNA 465 . 77 16 2636 111 373 296
PFDA 468 . 77 16 1465 111 373 296
PFUnda 470 . 75 . 1112 111 373 220
PFDoA 376 . 2 4 128 155 358 296
PFTrDA 368 . 75 . 128 111 344 296
PFTDA 181 . 75 . 128 111 344 68
PFHxDA 19 . . . . . . .
PFOcDA 2 . . . . . . .
PFBSA 169 . 2 16 2131 . 345 68
n-PFBSA . . . . . 155 . .
PFPeS . . . . . . . .
PFHXS 366 . 2 16 426 . 345 .
n-PFHXS . . . . . 155 . 68
br-PFHXS . . . . . 111 . 68
PFHpS 126 . . . . . 345 .
n-PFHpS . . . . . . . .
PFOS 433 103 311 46 . 53 373 .
n-PFOS . 20 75 . 2637 111 . 296
br-PFOS . . . . . 111 . 135
PFDS 187 . 2 4 128 . 345 68
n-PFDS . . . . . 111 . .
br-PFDS . . . . . 111 . .
HFPO-DA 2 . . . . . . .
ADONA . . . . . . . .
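The check made before merging two codes (that no sample and matrix combination was submitted under both) can be illustrated on a toy data set. This is a sketch in base R; the data frame `toy` and the helper `both_codes()` are hypothetical.

```r
# Toy data: sample 3 (matrix MU) has been submitted under both PFBS and PFBSA
toy <- data.frame(
  sample = c(1, 1, 2, 3, 3),
  matrix = c("LI", "LI", "LI", "MU", "MU"),
  determinand = c("PFBS", "PFOS", "PFBSA", "PFBS", "PFBSA")
)

# Hypothetical helper: TRUE if any sample / matrix combination appears more
# than once among the rows with the given codes
both_codes <- function(data, codes) {
  rows <- data[data$determinand %in% codes, c("sample", "matrix")]
  anyDuplicated(rows) != 0
}

both_codes(toy, c("PFBS", "PFBSA"))
both_codes(toy[toy$sample != 3, ], c("PFBS", "PFBSA"))
```

The first call flags the clash in sample 3; the second, with that sample removed, does not.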
Here is a summary of the number of samples by species, matrix and country. There is a range of species and tissues, including WO for polar bear, which is probably a submission error.
data |>
dplyr::filter(determinand %in% new_codes) |>
dplyr::mutate(species = paste(species, matrix)) |>
dplyr::distinct(sample, species, .keep_all = TRUE) |>
with(table(species, country)) |>
print(zero.print = ".")
country
species Denmark France Germany Ireland Norway Sweden The Netherlands United Kingdom
Cepphus grylle EH 3 . . . . . . .
Clupea harengus LI . . . . . 115 . .
Fulmarus glacialis EH 10 . . . . . . .
Fulmarus glacialis MU 20 . . . . . . .
Gadus morhua LI . . . . 2369 14 . .
Globicephala melas LI 51 . . . . . . .
Globicephala melas MU 39 . . . . . . .
Haematopus ostralegus EG . . . . . 11 . .
Limanda limanda LI 1 . . . . . . 265
Limanda limanda MU . . 278 . . . . 3
Magallana gigas SB . 17 . . . . 48 .
Merlangius merlangus LI . . . . . . . 9
Merlangius merlangus MU . . . . . . . 5
Mytilus edulis SB . 106 . 46 149 . 41 .
Neogobius melanostomus LI 2 . . . . . . .
Ovis aries LI 8 . . . . . . .
Platichthys flesus LI 26 . . . . . 150 .
Platichthys flesus MU 2 . 31 . . . . .
Pleuronectes platessa LI 12 . . . . . 134 9
Pleuronectes platessa MU . . . . . . . 5
Pusa hispida BB 50 . . . . . . .
Pusa hispida LI 60 . . . . . . .
Somateria mollissima BL . . . . 59 . . .
Somateria mollissima EH . . . . 60 . . .
Sterna hirundo EG . . . . . 11 . .
Ursus maritimus BB 30 . . . . . . .
Ursus maritimus LI 117 . . . . . . .
Ursus maritimus WO 43 . . . . . . .
Zoarces viviparus LI 12 . . . . 13 . .
Zoarces viviparus MU . . 2 . . . . .
What do we do if only one of the branched and linear compounds has been submitted? There are three sets of branched and linear compounds with data: PFHXS, PFOS and PFDS. There are also submissions of linear PFBSA (but no corresponding branched submissions).
PFDS: no need to worry because Sweden has always submitted both branched and linear compounds and no other contracting party has submitted either.
PFBSA: only Sweden has submitted n-PFBSA. According to the lab responsible for the analysis, the branched isomer will only be a few percent of the total, so n-PFBSA can effectively be taken to be PFBSA. With no supporting data available, this will have to be taken on trust.
PFOS and PFHXS: there are samples for which only the linear isomer has been submitted. For each compound in turn, the data set was filtered to give all the samples with both branched and linear forms and with both measurements above the limit of detection (or quantification).
# PFOS
wk <- dplyr::filter(data, determinand %in% c("n-PFOS", "br-PFOS"))
# check units and basis are consistent within samples
check <- wk |>
dplyr::select(sample, matrix, unit, basis) |>
unique() |>
dplyr::select(sample, matrix) |>
anyDuplicated()
if (check != 0) stop()
wk <- dplyr::select(
wk,
sample, matrix, species, determinand, censoring, value
)
wk <- tidyr::pivot_wider(wk, names_from = determinand, values_from = c(value, censoring))
wk$determinand <- "PFOS"
names(wk)[pmatch("value_br", names(wk))] <- "branched"
names(wk)[pmatch("value_n", names(wk))] <- "linear"
names(wk)[pmatch("censoring_br", names(wk))] <- "branched_q"
names(wk)[pmatch("censoring_n", names(wk))] <- "linear_q"
out <- wk
out <- dplyr::filter(out, !(is.na(branched) | is.na(linear)))
out <- dplyr::filter(out, is.na(branched_q) & is.na(linear_q))
out_PFOS <- dplyr::mutate(out, proportion_linear = 100 * linear / (linear + branched))
# PFHXS
wk <- dplyr::filter(data, determinand %in% c("n-PFHXS", "br-PFHXS"))
# check units and basis are consistent within samples
check <- wk |>
dplyr::select(sample, matrix, unit, basis) |>
unique() |>
dplyr::select(sample, matrix) |>
anyDuplicated()
if (check != 0) stop()
wk <- dplyr::select(
wk,
sample, matrix, species, determinand, censoring, value
)
wk <- tidyr::pivot_wider(wk, names_from = determinand, values_from = c(value, censoring))
wk$determinand <- "PFHXS"
names(wk)[pmatch("value_br", names(wk))] <- "branched"
names(wk)[pmatch("value_n", names(wk))] <- "linear"
names(wk)[pmatch("censoring_br", names(wk))] <- "branched_q"
names(wk)[pmatch("censoring_n", names(wk))] <- "linear_q"
out <- wk
out <- dplyr::filter(out, !(is.na(branched) | is.na(linear)))
out <- dplyr::filter(out, is.na(branched_q) & is.na(linear_q))
out_PFHXS <- dplyr::mutate(out, proportion_linear = 100 * linear / (linear + branched))
out <- list(PFOS = out_PFOS, PFHXS = out_PFHXS)
There are 184 such samples for PFOS and 72 for PFHXS.
Here are plots of the two components, first for PFOS:
lattice::xyplot(
branched ~ linear,
xlab = "n-PFOS ug/mg",
ylab = "br-PFOS ug/mg",
scales = list(log = TRUE, equispaced.log = FALSE),
data = out$PFOS
)
And now for PFHXS:
lattice::xyplot(
branched ~ linear,
xlab = "n-PFHXS ug/mg",
ylab = "br-PFHXS ug/mg",
scales = list(log = TRUE, equispaced.log = FALSE),
data = out$PFHXS
)
It looks like many of the br-PFHXS measurements are actually censored (those with an identical value), but haven’t been recorded as such. It turns out (see later) that this conversion is not critical, so there is no point investigating further.
Typically, the linear isomer makes up the following percentage of the total concentration:
lapply(out, function(x) summary(x$proportion_linear))
$PFOS
Min. 1st Qu. Median Mean 3rd Qu. Max.
70.80 87.28 91.05 90.02 94.15 99.57
$PFHXS
Min. 1st Qu. Median Mean 3rd Qu. Max.
21.82 66.67 81.82 75.38 88.01 96.57
MIME 2024 advocated using the linear PFOS concentration as a substitute for total PFOS concentration when the branched concentration is not submitted. This seems reasonable, although not precautionary. An alternative is to inflate the linear concentration by a factor of 1.1 to account (crudely) for the discrepancy. Similarly, the linear concentration of PFHXS could be inflated by a factor of 1.3.
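One way to see where inflation factors of this size come from is to take the reciprocal of the typical linear proportion. The sketch below does this for the medians and means in the summaries above; which summary statistic was actually used to arrive at 1.1 and 1.3 is not stated, so this is an illustration only.

```r
# Percentage of the total accounted for by the linear isomer (from the
# summaries above)
prop_linear <- rbind(
  PFOS = c(median = 91.05, mean = 90.02),
  PFHXS = c(median = 81.82, mean = 75.38)
)

# If linear = p% of total, then total = linear * 100 / p
inflation <- 100 / prop_linear
round(inflation, 2)
```

For PFOS both statistics give a factor close to 1.1. For PFHXS the median gives about 1.2 and the mean about 1.3.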
Now let’s look at profiles to identify the ‘most important’ compounds in any analysis of the PFAS 24. This can be done in so many ways! The approach taken here, as a first attempt, was to:
Each contracting party measures a different selection of PFAS compounds, so profiles were considered for each contracting party in turn. For each contracting party:
Note that there is likely to be some double counting when, for example, one sample is from the muscle and another is from the liver of the same fish.
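The core calculation behind a profile is to weight each concentration by its relative potency factor and express the result as a percentage of the sample total. The sketch below shows this for a single sample. The RPFs are taken from the table of MIME codes above, but the concentrations are made up for illustration.

```r
# Relative potency factors from the MIME codes table
rpf <- c(PFOA = 1, PFNA = 10, PFDA = 7, PFOS = 2)

# Hypothetical concentrations for one sample (same units and basis throughout)
conc <- c(PFOA = 0.5, PFNA = 0.2, PFDA = 0.1, PFOS = 1.0)

# PFOA equivalents, then each compound's share of the sample total (%)
pfoa_equiv <- conc * rpf[names(conc)]
prop <- 100 * pfoa_equiv / sum(pfoa_equiv)
round(prop, 1)
```

In this made-up sample PFNA contributes as much as PFOS, despite a concentration five times lower, because its RPF is five times higher.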
Here’s a plot of the proportions:
wk <- dplyr::filter(data, determinand %in% new_codes)
# check units and basis are consistent within samples
check <- wk |>
dplyr::select(sample, matrix, unit, basis) |>
unique() |>
dplyr::select(sample, matrix) |>
anyDuplicated()
if (check != 0) stop()
wk <- dplyr::select(
wk,
country, sample, matrix, species, determinand, censoring, value
)
wk <- dplyr::mutate(wk, censoring = dplyr::if_else(is.na(censoring), FALSE, TRUE))
# make censored concentrations zero because lods vary so much (within and
# between samples) and we will base profiles on samples with at least e.g. four
# non-censored values
wk <- dplyr::mutate(wk, value = dplyr::if_else(censoring, 0, value))
wk <- tidyr::pivot_wider(wk, names_from = determinand, values_from = c(value, censoring))
names(wk) <- gsub("value_", "", names(wk))
names(wk) <- gsub("censoring_", "q_", names(wk))
names(wk) <- gsub("-", "_", names(wk), fixed = TRUE)
wk <- dplyr::mutate(
wk,
# PFOS
.id = is.na(PFOS) & !is.na(br_PFOS) & !is.na(n_PFOS),
PFOS = dplyr::if_else(!.id, PFOS, br_PFOS + n_PFOS),
q_PFOS = dplyr::if_else(!.id, q_PFOS, q_br_PFOS & q_n_PFOS),
.id = is.na(PFOS) & !is.na(n_PFOS),
PFOS = dplyr::if_else(!.id, PFOS, n_PFOS * 1.1),
q_PFOS = dplyr::if_else(!.id, q_PFOS, q_n_PFOS),
# PFDS
.id = is.na(PFDS) & !is.na(br_PFDS) & !is.na(n_PFDS),
PFDS = dplyr::if_else(!.id, PFDS, br_PFDS + n_PFDS),
q_PFDS = dplyr::if_else(!.id, q_PFDS, q_br_PFDS & q_n_PFDS),
# PFHXS
.id = is.na(PFHXS) & !is.na(br_PFHXS) & !is.na(n_PFHXS),
PFHXS = dplyr::if_else(!.id, PFHXS, br_PFHXS + n_PFHXS),
q_PFHXS = dplyr::if_else(!.id, q_PFHXS, q_br_PFHXS & q_n_PFHXS),
.id = is.na(PFHXS) & !is.na(n_PFHXS),
PFHXS = dplyr::if_else(!.id, PFHXS, n_PFHXS * 1.3),
q_PFHXS = dplyr::if_else(!.id, q_PFHXS, q_n_PFHXS),
# PFBSA
.id = is.na(PFBSA) & !is.na(n_PFBSA),
PFBSA = dplyr::if_else(!.id, PFBSA, n_PFBSA * 1.0),
q_PFBSA = dplyr::if_else(!.id, q_PFBSA, q_n_PFBSA),
.id = NULL
)
id <- names(wk)
id <- id[!grepl("br_", id)]
id <- id[!grepl("n_", id)]
wk <- wk[id]
# retain samples with at least six measurements and four or more non-censored
# values
id <- dplyr::select(wk, dplyr::starts_with("q_"))
n_censored <- apply(id, 1, function(x) {x <- na.omit(x); sum(x)})
n_total <- apply(id, 1, function(x) {x <- na.omit(x); length(x)})
n_noncensored <- n_total - n_censored
wk <- dplyr::filter(wk, n_total >= 6 & n_noncensored >= 4)
wk <- dplyr::select(wk, - dplyr::starts_with("q_"))
# get relative potency factors
# rpf <- MIME_codes |>
# dplyr::select(param, rpf) |>
# dplyr::filter(!is.na(param)) |>
# dplyr::filter(param != "PFOSD") |>
# dplyr::mutate(
# rpf = sapply(strsplit(rpf, "-", fixed = TRUE), tail, n = 1),
# rpf = as.numeric(rpf)
# ) |>
# tibble::column_to_rownames("param")
rpf <- MIME_codes |>
dplyr::select(param, rpf) |>
dplyr::filter(!is.na(param)) |>
dplyr::filter(param != "PFOSD") |>
tibble::column_to_rownames("param")
wk <- split(wk, wk$country)
out <- lapply(wk, function(x) {
x <- dplyr::select(x, - country, - sample, - matrix, - species)
n <- apply(x, 2, function(y) {y <- na.omit(y); length(y)})
x <- x[names(n)[n >= 20]]
x <- x[complete.cases(x), ]
stopifnot(names(x) %in% row.names(rpf))
rpf <- rpf[names(x), "rpf"]
prop <- apply(
x,
1,
FUN = function(y, rpf) {
y <- y * rpf
y <- 100 * y / sum(y)
},
rpf = rpf
)
prop <- t(prop) |> as.data.frame()
average <- 100 * colSums(prop) / sum(colSums(prop))
prop$sample <- 1:nrow(prop)
prop <- tidyr::pivot_longer(prop, -sample, names_to = "determinand", values_to = "prop")
list(prop = prop, average = average)
})
out_prop <- lapply(out, "[[", "prop") |> dplyr::bind_rows(.id = "country")
out_average <- lapply(out, function(x) {
x <- x$average
x <- data.frame(determinand = names(x), prop = x)
row.names(x) <- NULL
x
})
out_average <- dplyr::bind_rows(out_average, .id = "country")
out_final <- dplyr::bind_rows(list(prop = out_prop, average = out_average), .id = "type")
out_final <- dplyr::mutate(
out_final,
determinand = factor(determinand),
determinand = reorder(determinand, prop, mean)
)
# lattice functions are called with an explicit namespace, as for xyplot above
lattice::stripplot(
determinand ~ prop | country, data = out_final,
scales = list(alternating = FALSE),
layout = c(3, 2),
panel = function(x, y, subscripts) {
data <- out_final[subscripts,]
# individual samples in grey
id <- data$type == "prop"
lattice::lpoints(x[id], y[id], col = grey(0.7), pch = 16, cex = 0.7)
# averages: six largest in red, the rest in blue
id <- data$type == "average"
ord <- order(x[id], decreasing = TRUE)
lattice::lpoints(x[id][ord[1:6]], y[id][ord[1:6]], col = "red", pch = 16)
lattice::lpoints(x[id][ord[-c(1:6)]], y[id][ord[-c(1:6)]], col = "blue", pch = 16)
}
)
And here’s the same plot on the square root scale to better visualise differences between the compounds. Note that the proportions for Germany are ‘elevated’ relative to the other countries because they are based on only seven compounds.
# lattice functions are called with an explicit namespace, as for xyplot above
lattice::stripplot(
determinand ~ sqrt(prop) | country, data = out_final,
xlab = "proportion",
layout = c(3, 2),
scales = list(
alternating = FALSE,
x = list(at = sqrt(c(0, 5, 15, 30, 60)), labels = c("0", "5", "15", "30", "60"))
),
panel = function(x, y, subscripts) {
data <- out_final[subscripts,]
# individual samples in grey
id <- data$type == "prop"
lattice::lpoints(x[id], y[id], col = grey(0.7), pch = 16, cex = 0.7)
# averages: six largest in red, the rest in blue
id <- data$type == "average"
ord <- order(x[id], decreasing = TRUE)
lattice::lpoints(x[id][ord[1:6]], y[id][ord[1:6]], col = "red", pch = 16)
lattice::lpoints(x[id][ord[-c(1:6)]], y[id][ord[-c(1:6)]], col = "blue", pch = 16)
}
)
Apart from Germany, the six ‘most important’ compounds are always PFOS, PFNA, PFDA, PFUnda, PFDoA, and PFTrDA. Germany don’t report PFDoA.
Note that this does not include PFOA, which is the compound with a relative potency factor of 1.
The six ‘most important’ compounds measured by all contracting parties (that contribute to the profile analysis above) are PFOS, PFNA, PFDA, PFUnda, PFTrDA and PFOA.
The table below shows the proportions explained by each compound.
wk1 <- dplyr::filter(
out_final,
type == "average"
)
wk1 <- tidyr::pivot_wider(wk1, names_from = country, values_from = prop)
wk1 <- dplyr::select(wk1, -sample, -type)
wk1 <- as.data.frame(wk1)
wk1 <- tibble::column_to_rownames(wk1, "determinand")
ord <- rev(levels(out_final$determinand))
ord <- ord[ord %in% row.names(wk1)]
wk1 <- wk1[ord, ]
round(wk1, 3)
Denmark Germany Norway Sweden The Netherlands United Kingdom
PFOS 37.245 45.640 33.296 47.125 52.334 49.952
PFNA 39.744 22.732 4.312 9.708 20.774 23.338
PFDA 13.176 16.562 11.635 12.860 14.498 15.524
PFUnda 6.768 9.471 24.547 19.280 7.190 4.727
PFDoA 0.747 NA 11.736 4.887 1.965 2.132
PFTrDA 1.256 3.146 12.212 4.716 1.001 1.510
PFOA 0.268 2.431 0.041 0.630 0.768 0.881
PFDS 0.254 NA 2.036 0.092 0.845 0.945
PFHXS 0.166 NA 0.017 0.448 0.466 0.447
PFHpS 0.315 NA NA NA 0.150 NA
PFTDA 0.030 0.017 0.162 0.214 0.004 0.178
PFHPA 0.030 NA 0.000 0.034 0.000 0.325
PFBA NA NA NA NA 0.003 0.023
PFPeA NA NA NA NA 0.000 0.014
PFHXA 0.001 NA 0.006 0.004 0.000 0.005
PFBSA 0.000 NA 0.000 0.002 0.000 0.000
And here are the cumulative proportions. The top six compounds explain over 97% of the ‘total’ PFOA equivalent concentration in all cases (where they are all reported). However, this is not the true total concentration, because some of the PFAS 24 are not reported at all.
wk2 <- wk1
for (i in 2:nrow(wk2)) {
wk2[i,] <- apply(wk1, 2, function(x) {if (is.na(x[i])) NA else sum(x[1:i], na.rm = TRUE)})
}
round(wk2, 2)
Denmark Germany Norway Sweden The Netherlands United Kingdom
PFOS 37.24 45.64 33.30 47.12 52.33 49.95
PFNA 76.99 68.37 37.61 56.83 73.11 73.29
PFDA 90.17 84.93 49.24 69.69 87.61 88.81
PFUnda 96.93 94.40 73.79 88.97 94.80 93.54
PFDoA 97.68 NA 85.53 93.86 96.76 95.67
PFTrDA 98.94 97.55 97.74 98.58 97.76 97.18
PFOA 99.20 99.98 97.78 99.21 98.53 98.06
PFDS 99.46 NA 99.81 99.30 99.38 99.01
PFHXS 99.62 NA 99.83 99.75 99.84 99.46
PFHpS 99.94 NA NA NA 99.99 NA
PFTDA 99.97 100.00 99.99 99.96 100.00 99.63
PFHPA 100.00 NA 99.99 99.99 100.00 99.96
PFBA NA NA NA NA 100.00 99.98
PFPeA NA NA NA NA 100.00 100.00
PFHXA 100.00 NA 100.00 100.00 100.00 100.00
PFBSA 100.00 NA 100.00 100.00 100.00 100.00
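The same cumulative proportions can be obtained without an explicit loop. The sketch below defines a hypothetical helper, `cumulative_prop()`, that treats a missing compound as contributing nothing to the running total but still reports NA at that position, and applies it to the first six (rounded) values for Germany from the table of proportions.

```r
# Hypothetical helper: running total that skips NAs but keeps them in place
cumulative_prop <- function(x) {
  running <- cumsum(ifelse(is.na(x), 0, x))
  ifelse(is.na(x), NA_real_, running)
}

# First six average proportions for Germany (PFDoA is not in the profile)
germany <- c(
  PFOS = 45.640, PFNA = 22.732, PFDA = 16.562,
  PFUnda = 9.471, PFDoA = NA, PFTrDA = 3.146
)

cumulative_prop(germany)
```

Applied column by column to a data frame of proportions, for example with `lapply()`, this should reproduce the table above up to rounding.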
Here is the number of PFAS measurements submitted by each contracting party.
country
determinand Germany Ireland The Netherlands
n-PFOS . . 363
PFBS 38 . .
PFDA 38 212 .
PFDS . 212 .
PFHPA 38 212 .
PFHpS . 212 .
PFHXA 39 212 .
PFHXS 38 212 .
PFNA . 212 .
PFNOA 38 . .
PFNS . 212 .
PFOA 38 271 .
PFOS 144 271 .
PFOSD 38 . .
Focussing on the PFAS 24, here are the relevant numbers of samples by contracting party. PFBS has been mapped to PFBSA and PFNOA has been mapped to PFNA. The three compounds with no ICES codes are omitted.
# check no data submitted as both PFBS and PFBSA (or PFNA and PFNOA)
check <- data |>
dplyr::filter(determinand %in% c("PFBS", "PFBSA")) |>
dplyr::select(sample, matrix) |>
anyDuplicated()
if (check != 0) stop()
check <- data |>
dplyr::filter(determinand %in% c("PFNA", "PFNOA")) |>
dplyr::select(sample, matrix) |> anyDuplicated()
if (check != 0) stop()
data <- dplyr::mutate(
data,
determinand = dplyr::case_match(
determinand,
"PFBS" ~ "PFBSA",
"PFNOA" ~ "PFNA",
.default = determinand
)
)
new_codes <- na.omit(MIME_codes$param)
new_codes <- setdiff(new_codes, "PFOSD")
new_codes <- append(new_codes, "n-PFBSA", after = match("PFBSA", new_codes))
new_codes <- append(new_codes, c("n-PFHXS", "br-PFHXS"), after = match("PFHXS", new_codes))
new_codes <- append(new_codes, "n-PFHpS", after = match("PFHpS", new_codes))
new_codes <- append(new_codes, c("n-PFOS", "br-PFOS"), after = match("PFOS", new_codes))
new_codes <- append(new_codes, c("n-PFDS", "br-PFDS"), after = match("PFDS", new_codes))
data |>
dplyr::filter(determinand %in% new_codes) |>
dplyr::mutate(determinand = factor(determinand, new_codes)) |>
with(table(determinand, country)) |>
print(zero.print = ".")
country
determinand Germany Ireland The Netherlands
PFBA . . .
PFPeA . . .
PFHXA 39 212 .
PFHPA 38 212 .
PFOA 38 271 .
PFNA 38 212 .
PFDA 38 212 .
PFUnda . . .
PFDoA . . .
PFTrDA . . .
PFTDA . . .
PFHxDA . . .
PFOcDA . . .
PFBSA 38 . .
n-PFBSA . . .
PFPeS . . .
PFHXS 38 212 .
n-PFHXS . . .
br-PFHXS . . .
PFHpS . 212 .
n-PFHpS . . .
PFOS 144 271 .
n-PFOS . . 363
br-PFOS . . .
PFDS . 212 .
n-PFDS . . .
br-PFDS . . .
HFPO-DA . . .
ADONA . . .
The linear component of PFOS has been submitted by the Netherlands, but there are no corresponding branched measurements. However, the Netherlands supplied both linear and branched PFOS measurements from their national database, which are plotted below.
wk <- read.csv(
file.path("extra_resources", "L_PFOS SbranchedPFOS zeewater.csv"),
sep = ";",
na.strings = ""
)
wk <- dplyr::select(
wk,
PARAMETER_.CODE, WAARNEMINGDATUM, WAARNEMINGTIJD..MET.CET.,
ALFANUMERIEKEWAARDE
)
names(wk) <- c("determinand", "date", "time", "value")
# a small number of replicates
wk <- dplyr::distinct(wk, determinand, date, time, .keep_all = TRUE)
wk1 <- tidyr::pivot_wider(wk, names_from = determinand, values_from = value)
wk1 <- as.data.frame(wk1)
# get rid of anomalous (huge) values
wk1 <- dplyr::filter(wk1, L_PFOS < 1000)
wk1 <- dplyr::mutate(wk1, prop = 100 * L_PFOS / (L_PFOS + sverttPFOS))
lattice::xyplot(
sverttPFOS ~ L_PFOS,
xlab = "n-PFOS ug/ml",
ylab = "br-PFOS ug/ml",
scales = list(log = TRUE, equispaced.log = FALSE),
data = wk1
)
The linear component is typically about 60% of the total, which would suggest inflating n-PFOS concentrations by 1.7 to get total PFOS concentrations.
summary(wk1$prop)
Min. 1st Qu. Median Mean 3rd Qu. Max.
32.08 54.55 59.23 59.11 63.85 76.08
Applying the same criteria as for biota leaves 33 samples submitted by Germany and one sample submitted by Ireland. So let’s just look at the German data.
wk <- dplyr::filter(data, determinand %in% new_codes)
# check units and basis are consistent within samples
check <- wk |>
dplyr::select(sample, filtration, unit) |>
unique() |>
dplyr::select(sample, filtration) |>
anyDuplicated()
if (check != 0) stop()
wk <- dplyr::select(
wk,
country, sample, filtration, determinand, censoring, value
)
wk <- dplyr::mutate(wk, censoring = dplyr::if_else(is.na(censoring), FALSE, TRUE))
# make censored concentrations zero because lods vary so much (within and
# between samples) and we will base profiles on samples with at least e.g. four
# non-censored values
wk <- dplyr::mutate(wk, value = dplyr::if_else(censoring, 0, value))
wk <- tidyr::pivot_wider(wk, names_from = determinand, values_from = c(value, censoring))
names(wk) <- gsub("value_", "", names(wk))
names(wk) <- gsub("censoring_", "q_", names(wk))
names(wk) <- gsub("-", "_", names(wk), fixed = TRUE)
wk <- dplyr::mutate(
wk,
.id = is.na(PFOS) & !is.na(n_PFOS),
PFOS = dplyr::if_else(!.id, PFOS, n_PFOS * 1.11),
q_PFOS = dplyr::if_else(!.id, q_PFOS, q_n_PFOS),
.id = NULL
)
id <- names(wk)
id <- id[!grepl("br_", id)]
id <- id[!grepl("n_", id)]
wk <- wk[id]
# retain samples with at least six measurements and four or more non-censored
# values
id <- dplyr::select(wk, dplyr::starts_with("q_"))
n_censored <- apply(id, 1, function(x) {x <- na.omit(x); sum(x)})
n_total <- apply(id, 1, function(x) {x <- na.omit(x); length(x)})
n_noncensored <- n_total - n_censored
wk <- dplyr::filter(wk, n_total >= 6 & n_noncensored >= 4)
wk <- dplyr::select(wk, - dplyr::starts_with("q_"))
wk <- dplyr::filter(wk, country == "Germany")
wk <- split(wk, wk$country)
out <- lapply(wk, function(x) {
x <- dplyr::select(x, - country, - sample, - filtration)
n <- apply(x, 2, function(y) {y <- na.omit(y); length(y)})
x <- x[names(n)[n >= 20]]
x <- x[complete.cases(x), ]
stopifnot(names(x) %in% row.names(rpf))
rpf <- rpf[names(x), "rpf"]
prop <- apply(
x,
1,
FUN = function(y, rpf) {
y <- y * rpf
y <- 100 * y / sum(y)
},
rpf = rpf
)
prop <- t(prop) |> as.data.frame()
average <- 100 * colSums(prop) / sum(colSums(prop))
prop$sample <- 1:nrow(prop)
prop <- tidyr::pivot_longer(prop, -sample, names_to = "determinand", values_to = "prop")
list(prop = prop, average = average)
})
out_prop <- lapply(out, "[[", "prop") |> dplyr::bind_rows(.id = "country")
out_average <- lapply(out, function(x) {
x <- x$average
x <- data.frame(determinand = names(x), prop = x)
row.names(x) <- NULL
x
})
out_average <- dplyr::bind_rows(out_average, .id = "country")
out_final <- dplyr::bind_rows(list(prop = out_prop, average = out_average), .id = "type")
out_final <- dplyr::mutate(
out_final,
determinand = factor(determinand),
determinand = reorder(determinand, prop, mean)
)
# lattice functions are called with an explicit namespace, as for xyplot above
lattice::stripplot(
determinand ~ prop, data = out_final,
panel = function(x, y, subscripts) {
data <- out_final[subscripts,]
# individual samples in grey
id <- data$type == "prop"
lattice::lpoints(x[id], y[id], col = grey(0.7), pch = 16, cex = 0.7)
# averages: six largest in red, the rest in blue
id <- data$type == "average"
ord <- order(x[id], decreasing = TRUE)
lattice::lpoints(x[id][ord[1:6]], y[id][ord[1:6]], col = "red", pch = 16)
lattice::lpoints(x[id][ord[-c(1:6)]], y[id][ord[-c(1:6)]], col = "blue", pch = 16)
}
)
The six ‘most important’ compounds are PFOS, PFNA, PFOA, PFHXS, PFHPA and PFDA. But the evidence base is weak! Ireland also report all these compounds.
At a meeting (5 February 2025) of MIME representatives of contracting parties that submit a suite of PFAS compounds it was decided that, for the 2025 CEMP Assessment:
These decisions will be reviewed before the 2026 CEMP Assessment.
Notes:
It was also decided to change the way in which (weighted) sums are computed in harsat. In previous assessments, censored values have been included in the sum, but for the 2025 assessment censored values will be set to zero (unless all values are censored). This is in line with EU Water Framework Directive recommendations.
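The rule for censored values can be sketched as follows. This is not the harsat implementation: `weighted_sum()` is a hypothetical helper, the concentrations are made up, and the behaviour when every value is censored (keep the censored values and flag the sum as censored) is an assumption, since the text above only says that this case is treated differently.

```r
# Hypothetical helper, NOT the harsat implementation
weighted_sum <- function(value, censored, weight = rep(1, length(value))) {
  stopifnot(
    length(value) == length(censored),
    length(value) == length(weight)
  )
  if (all(censored)) {
    # ASSUMPTION: with everything censored, keep the censored values in the
    # sum and flag the result as censored
    return(list(value = sum(weight * value), censored = TRUE))
  }
  # otherwise censored values contribute zero to the sum
  list(value = sum(weight * ifelse(censored, 0, value)), censored = FALSE)
}

rpf <- c(PFOS = 2, PFNA = 10, PFOA = 1)          # from the MIME codes table
value <- c(PFOS = 1.5, PFNA = 0.1, PFOA = 0.2)   # made-up concentrations

# PFNA is censored, so only PFOS and PFOA contribute
weighted_sum(value, censored = c(FALSE, TRUE, FALSE), weight = rpf)

# all values censored: the assumed fallback applies
weighted_sum(value, censored = c(TRUE, TRUE, TRUE), weight = rpf)
```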