Chain-Sawing: A Longitudinal Analysis of Certificate Chains
Marcus Döberl, York Freiherr von Wangenheim, Carl Magnus Bruhner, David Hasselquist, Martin Arlitt, Niklas Carlsson
Paper:
Marcus Döberl, York Freiherr von Wangenheim, Carl Magnus Bruhner, David Hasselquist, Martin Arlitt, Niklas Carlsson,
Chain-Sawing: A Longitudinal Analysis of Certificate Chains,
Proc. IFIP Networking,
Thessaloniki, Greece,
June 2024.
(pdf)
Abstract:
The security and integrity of TLS certificates are essential for ensuring secure transmission over the Internet and protecting millions of people from man-in-the-middle attacks. Certificate Authorities (CAs) play a crucial role in issuing and managing these certificates. This paper presents a longitudinal analysis of certificate chains for popular domains, examining their evolution over time and across different categories. Using publicly available certificate data, primarily from crt.sh, we created a longitudinal dataset of certificate chains for domains from the Tranco Top 1M list. After categorizing the certificates based on their type and service category, we analyze a selected set of domains over time and identify the patterns and trends that emerge in their certificate chains. Our analysis reveals several noteworthy trends, including a trend towards shorter certificate chains and fewer paths from domains to root certificates. This implies that the certificate process is becoming simpler and more streamlined. Combined with our observations of an increasing use of new CAs and a shift in the types of certificates used, we expect part of this to be an effect of individual choices made by some popular CAs (e.g., fewer cross-signings). In general, the observed trends, patterns, and findings capture tradeoffs in overhead, backward compatibility, and security. The quick shifts in some of the observed metrics (e.g., chain lengths) therefore also highlight the importance of continued monitoring and analysis of certificate chains.
Datasets
The datasets can be downloaded here [66.4 MB].
If you use our dataset in your research, please include a reference to our IFIP Networking 2024 paper (pdf) in your work.
Dataset A
The dataset contains the certificates belonging to a CA in a “comma”-separated file with the following fields:
- ca: the ID of a CA in the crt.sh database.
- commonName: the common name of the certificate authority.
- crtID: the unique identifier of a certificate in the crt.sh database.
- issuerCA: the CA ID of the issuer of the certificate.
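As a rough illustration of how these fields relate, the sketch below follows issuerCA links from one CA towards a root. It assumes the file has a header row with the field names listed above and uses a literal comma as the delimiter; neither is confirmed here, so check the downloaded files first. The sample rows, IDs, and names are made up for illustration, and the helper names load_issuers and chain_to_root are hypothetical.

```python
import csv
import io

# Made-up sample rows in the assumed layout (header row, comma delimiter).
SAMPLE_A = """ca,commonName,crtID,issuerCA
1,Example Root CA,100,1
2,Example Intermediate CA,200,1
3,Example Issuing CA,300,2
"""

def load_issuers(handle):
    """Map each CA ID to the set of CA IDs that issued one of its certificates."""
    issuers = {}
    for row in csv.DictReader(handle):
        issuers.setdefault(int(row["ca"]), set()).add(int(row["issuerCA"]))
    return issuers

def chain_to_root(issuers, ca_id):
    """Follow issuer links from ca_id until no unvisited issuer remains."""
    chain = [ca_id]
    while True:
        # Skip CAs already on the chain (this also drops self-issued links);
        # when several issuers exist, pick the smallest ID for determinism.
        parents = sorted(issuers.get(chain[-1], set()) - set(chain))
        if not parents:
            return chain
        chain.append(parents[0])

issuers = load_issuers(io.StringIO(SAMPLE_A))
chain = chain_to_root(issuers, 3)
print(chain)
```

A CA with several issuers (for example through cross-signing) has more than one path to a root. This sketch follows only one of them, so counting paths would require enumerating every branch instead.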
Dataset B
The dataset contains, for each domain subcategory, a “comma”-separated file with the following fields:
- ranking: the domain's rank on the Tranco Top 1M list.
- domainName: the domain name.
- year: the year in which the certificate is valid, determined by its not-after date.
- crtID: the unique ID of the certificate in the crt.sh database.
- crtCaID: the unique crt.sh ID of the CA that issued the certificate.
- extension: the type of the certificate (OV, DV, EV, or OTHER).
- notBefore: the not-before date of the certificate.
- notAfter: the not-after date of the certificate.
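A minimal sketch of one way to use these fields is to count certificate types per year. It assumes a header row with the field names above and a literal comma delimiter. The sample rows are made up, the date format shown in them is an assumption (the dates are not parsed), and type_counts_per_year is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in the assumed layout; the date format is a guess.
SAMPLE_B = """ranking,domainName,year,crtID,crtCaID,extension,notBefore,notAfter
1,example.com,2022,1001,10,DV,2021-11-01,2022-11-01
1,example.com,2023,1002,10,DV,2022-11-01,2023-11-01
2,example.org,2023,1003,11,OV,2022-06-01,2023-06-01
"""

def type_counts_per_year(handle):
    """Count rows per (year, certificate type) pair."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[(int(row["year"]), row["extension"])] += 1
    return counts

counts = type_counts_per_year(io.StringIO(SAMPLE_B))
for (year, cert_type), n in sorted(counts.items()):
    print(year, cert_type, n)
```

Since there is one file per domain subcategory, the same function could be run on each file to compare categories.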
Citation format
When citing our dataset or work, please cite the conference version of the paper:
- Marcus Döberl, York Freiherr von Wangenheim, Carl Magnus Bruhner, David Hasselquist, Martin Arlitt, Niklas Carlsson,
"Chain-Sawing: A Longitudinal Analysis of Certificate Chains",
Proc. IFIP Networking,
Thessaloniki, Greece,
June 2024.
(pdf)