fix(embed): don't use a non-leading heading as embedded notebook title#14639

Draft
cderv wants to merge 8 commits into main from fix/issue-14577

@cderv cderv commented Jun 30, 2026

Member

TL;DR

An embedded notebook with no title and no heading cell used a code-cell comment (e.g. # plt.savefig(...)) as its "Source" title. This PR gates heading-to-title promotion in findTitle on "nothing precedes the heading" (option A), so a # inside a code fence can't be the title and the filename is used instead. A side effect: a real markdown heading with prose before it also falls back to the filename — intentional, because a standalone notebook's own title already behaves that way. A more surgical fence-aware fix (option B) is described below but not taken here.


When embedding a notebook that has no YAML title and no markdown heading cell, the "Source" link and sidebar showed a Python code comment (e.g. # plt.savefig(...)) instead of the notebook filename.

Root Cause

findTitle() in src/core/jupyter/jupyter-embed.ts scans every cell's markdown to find a title. A code cell's markdown is its fenced source, and partitionMarkdown is not fence-aware, so a # comment line inside the fence matches the ATX-heading regex and gets returned as the title.
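
To illustrate the failure mode described above, here is a minimal sketch of a heading matcher that is not fence-aware. The regex and the `naiveFirstHeading` helper are hypothetical simplifications for this example, not Quarto's actual `partitionMarkdown` code.

```typescript
// Hypothetical, simplified ATX-heading matcher (an assumption for illustration,
// not the regex Quarto uses). It looks at every line, fenced or not.
const atxHeading = /^#{1,6}[ \t]+(.+)$/m;

function naiveFirstHeading(markdown: string): string | undefined {
  const match = atxHeading.exec(markdown);
  return match ? match[1].trim() : undefined;
}

// A code cell rendered as markdown: its source sits inside a fence.
const fence = "`".repeat(3);
const codeCellMarkdown = [
  fence + "python",
  "# plt.savefig('figure.png')",
  "plt.show()",
  fence,
].join("\n");

// The Python comment is picked up as if it were a heading.
const falseTitle = naiveFirstHeading(codeCellMarkdown);
console.log(falseTitle);
```

Because the matcher never tracks whether it is inside a fence, the comment line is indistinguishable from a real `# Heading` line.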

The non-embed path already guards against exactly this. The Dec-2023 "Improve title snipping in notebooks" change (24672a4, #5363/#6411) made fixupFrontMatter only promote a heading to the notebook title when no content precedes it. findTitle predates that work and was never brought in line, so it still promoted any heading-like line.

Fix (option A)

Expose contentBeforeHeading (already computed by markdownWithExtractedHeading) through PartitionedMarkdown, and gate findTitle on !partitioned.contentBeforeHeading. A fenced code block's opening delimiter always counts as content before the comment, so the comment is no longer mistaken for a title and the filename is used instead.

This borrows the same content-before-heading gate that fixupFrontMatter uses. The two functions are not otherwise equivalent — findTitle scans every cell and can't filter on cell_type, whereas fixupFrontMatter inspects only the first markdown cell.
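
A minimal sketch of the gate, assuming simplified shapes: only the name `contentBeforeHeading` comes from this PR, while `HeadingScan`, `scanForHeading`, and `titleFor` are hypothetical stand-ins for `markdownWithExtractedHeading`, `PartitionedMarkdown`, and `findTitle`.

```typescript
// Hypothetical shape: a heading plus a flag saying whether anything
// non-blank appeared before it in the same cell.
interface HeadingScan {
  heading?: string;
  contentBeforeHeading: boolean;
}

function scanForHeading(markdown: string): HeadingScan {
  let contentBeforeHeading = false;
  for (const line of markdown.split("\n")) {
    const match = /^#{1,6}[ \t]+(.+)$/.exec(line);
    if (match) {
      return { heading: match[1].trim(), contentBeforeHeading };
    }
    if (line.trim() !== "") {
      // A fence delimiter or a line of prose both count as preceding content.
      contentBeforeHeading = true;
    }
  }
  return { contentBeforeHeading };
}

// Scan every cell; promote a heading only when nothing precedes it,
// otherwise fall back to the filename.
function titleFor(cells: string[], filename: string): string {
  for (const cell of cells) {
    const scan = scanForHeading(cell);
    if (scan.heading !== undefined && !scan.contentBeforeHeading) {
      return scan.heading;
    }
  }
  return filename;
}

const fence = "`".repeat(3);
const codeCell = [fence + "python", "# plt.savefig('figure.png')", fence].join("\n");

console.log(titleFor([codeCell], "notebook.ipynb"));
console.log(titleFor(["# Real Title\n\nSome text"], "notebook.ipynb"));
console.log(titleFor(["Some prose\n\n# Later heading"], "notebook.ipynb"));
```

In this sketch the opening fence line sets the flag before the comment is reached, so the code cell yields the filename; the same flag also rejects a heading that follows prose.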

Behavior note

Because the gate keys off "content before the heading" rather than "is this a code cell", it also affects a real markdown cell whose heading has prose before it: that heading is no longer promoted, and the embed "Source" title falls back to the filename.

This is intentional and is the reason to prefer option A. A standalone notebook with prose before its first heading already gets no title from that heading — fixupFrontMatter only promotes a heading when nothing precedes it (since #5363 / #6411). Option A makes the embed "Source" title behave the same way as the notebook's own title, instead of using a heading that the rest of Quarto would not treat as the document title.

The documented precedence is YAML title, then the first markdown heading, then the filename. Option A is marginally stricter than the literal "first markdown heading" wording for the prose-before-heading case, but matches the actual title behavior of a standalone notebook.

Alternative (option B, not taken here)

The root cause mcanouil identified is that heading extraction is not fence-aware. Option B would make markdownWithExtractedHeading (or findTitle) skip # lines inside fenced code, fixing only the code-comment case and leaving a prose-before-heading markdown cell promotable — matching the literal documented "first markdown heading" wording.

Option B is the more surgical fix for the reported bug, but it touches every caller of markdownWithExtractedHeading and would leave the embed title behaving differently from the standalone notebook title for the prose-before-heading case. This PR takes option A to keep the two title code paths consistent; option B remains available if we later prefer to track the documented wording exactly.
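
For comparison, a fence-aware scan in the spirit of option B could look roughly like the sketch below. This is not code from this PR: `firstHeadingOutsideFences` is a hypothetical helper, and its fence handling is deliberately simplified (it matches a closing fence by character only and ignores fence length).

```typescript
// Hypothetical option-B style helper: skip heading-like lines inside fences.
function firstHeadingOutsideFences(markdown: string): string | undefined {
  let openFence: string | undefined; // fence character of the open block, if any
  for (const line of markdown.split("\n")) {
    const fenceMatch = /^ {0,3}(`{3,}|~{3,})/.exec(line);
    if (fenceMatch) {
      const fenceChar = fenceMatch[1].charAt(0);
      if (openFence === undefined) {
        openFence = fenceChar; // entering a fenced block
      } else if (openFence === fenceChar) {
        openFence = undefined; // leaving it (simplified: length not compared)
      }
      continue;
    }
    if (openFence !== undefined) {
      continue; // inside a fence: a "#" line is code, not a heading
    }
    const heading = /^#{1,6}[ \t]+(.+)$/.exec(line);
    if (heading) {
      return heading[1].trim();
    }
  }
  return undefined;
}

const fence = "`".repeat(3);
const codeCell = [fence + "python", "# plt.savefig('figure.png')", fence].join("\n");

console.log(firstHeadingOutsideFences(codeCell));
console.log(firstHeadingOutsideFences("Some prose\n\n# Later heading"));
```

The second call shows the behavioral difference discussed above: a heading preceded by prose is still returned here, whereas the content-before-heading gate would reject it.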

Tests

Two smoke-all tests under tests/docs/smoke-all/2026/06/30/:

  • 14577.qmd — the reported case: an embed of a notebook whose only cell is a #-comment code cell, asserting the Source link uses the filename and never the comment.
  • embed-heading-after-content.qmd — the general alignment: an embed of a notebook whose only heading has content before it, asserting the Source link falls back to the filename rather than the heading.

Two incidental test-infra cleanups came out of writing that test:

  • render-embed.test.ts teardowns now use safeRemoveSync instead of raw Deno.removeSync, so a teardown no longer throws (and aborts, leaving stale state) when a preview artifact is already gone.
  • smoke-all's postRenderCleanup now removes recursively, so a registered cleanup entry can be a *_files support directory and not only a single file.
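
The idempotent-cleanup idea behind both changes can be sketched as follows. `removeIfPresent` is a hypothetical helper written for this example, not the actual `safeRemoveSync`, and the in-memory `paths` set is only a stand-in for a real file system.

```typescript
// Hypothetical helper: run a removal and swallow only "already gone" failures.
function removeIfPresent(
  remove: () => void,
  isNotFound: (err: unknown) => boolean,
): void {
  try {
    remove();
  } catch (err) {
    if (!isNotFound(err)) {
      throw err; // genuine failures (permissions, I/O) still surface
    }
  }
}

// In-memory stand-in for a directory tree, used only to exercise the helper.
const paths = new Set<string>([
  "out/preview.html",
  "out/preview_files/figure.png",
]);

// Recursive removal: deletes the root and everything under it,
// and throws a NotFound-style error when nothing matches.
function removeTree(root: string): void {
  const doomed = Array.from(paths).filter(
    (p) => p === root || p.startsWith(root + "/"),
  );
  if (doomed.length === 0) {
    const err = new Error("not found: " + root);
    err.name = "NotFound";
    throw err;
  }
  doomed.forEach((p) => paths.delete(p));
}

const notFound = (err: unknown): boolean =>
  err instanceof Error && err.name === "NotFound";

removeIfPresent(() => removeTree("out/preview_files"), notFound);
// A second teardown over the same path does nothing instead of throwing.
removeIfPresent(() => removeTree("out/preview_files"), notFound);
```

Under Deno, the same pattern would presumably wrap `Deno.removeSync(path, { recursive: true })` and test for `Deno.errors.NotFound`.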

Fixes #14577

posit-snyk-bot commented Jun 30, 2026

Collaborator

Snyk checks have passed. No issues have been found so far.

Status  Scan Engine           Critical  High  Medium  Low  Total (0)
        Open Source Security  0         0     0       0    0 issues
        Licenses              0         0     0       0    0 issues

cderv added 7 commits June 30, 2026 18:14
…tle (#14577)

When embedding a notebook with no YAML title and no markdown heading cell,
the "Source" link and sidebar showed a Python comment (e.g. `# plt.savefig(...)`)
instead of the notebook filename.

findTitle() in jupyter-embed.ts scans every cell's markdown, including code
cells whose markdown is their fenced source. partitionMarkdown is not
fence-aware, so a `#` comment line inside the fence matches the ATX-heading
regex and was returned as the title.

The non-embed code path already guards against this: the Dec-2023 "Improve
title snipping in notebooks" work (24672a4, fixes #5363/#6411) made
fixupFrontMatter only promote a heading to a notebook title when no content
precedes it. findTitle predates that change and was never brought in line, so
it still promoted any heading-like line. Apply the same `!contentBeforeHeading`
gate here by exposing contentBeforeHeading (already computed by
markdownWithExtractedHeading) through PartitionedMarkdown. A fenced code
block's opening delimiter always counts as content before the comment, so the
comment is no longer mistaken for a title and the filename is used instead.

The teardowns called raw Deno.removeSync, which throws NotFound when a
preview artifact is already absent (e.g. a prior teardown partially ran, or
a sibling test cleaned a shared path). A throwing teardown aborts the test
and leaves the fixture directory half-cleaned, so the next run trips over
stale state. safeRemoveSync (deno_ral/fs.ts) tolerates already-removed
paths and only rethrows genuine errors, keeping cleanup idempotent.

postRenderCleanup registered entries were removed with a non-recursive
Deno.removeSync, so an entry could only be a single file. Embedded-notebook
tests produce a `*_files` support directory alongside the preview HTML, which
could not be declared for cleanup and was left behind. Use safeRemoveSync with
recursive removal so a registered entry can be a directory as well as a file.

The regression test for the embedded-notebook title fix started as a bespoke
testRender block in render-embed.test.ts with manual teardown of the preview
artifacts. smoke-all is the established home for issue-regression tests and its
framework already handles output cleanup, so move it there: a 14577.qmd that
embeds notebook.ipynb and asserts the Source link uses the filename, not the
Python comment, with the preview artifacts declared for postRenderCleanup.

These asserted pre-existing, unchanged extraction behavior plus a trivial
pass-through (partitionMarkdown now forwards contentBeforeHeading, a value
markdownWithExtractedHeading already computed). They pinned behavior rather
than exercising the fix. The fix lives in findTitle and is covered end-to-end
by the smoke-all regression test, so the unit additions added no real coverage.

Record why findTitle can't mirror fixupFrontMatter's markdown-cells-only
guard: it runs on rendered JupyterCellOutput, which carries no cell_type, so
the !contentBeforeHeading check is the available equivalent for excluding a
code cell's fenced source.

Add a smoke-all test for an embedded notebook whose only heading has prose
before it: the heading must not become the embed title, which falls back to
the filename. This is the same rule fixupFrontMatter applies to a notebook's
own front-matter title since 24672a4 (#5363/#6411); the embed title now
aligns with it. Broaden the changelog entry to describe this general behavior,
not just the code-comment case.
@cderv cderv force-pushed the fix/issue-14577 branch from 93acbbd to d095db3 Compare June 30, 2026 16:16
@cderv cderv marked this pull request as draft June 30, 2026 16:26
…hared gate

The comment and changelog claimed findTitle now derives the title the same way
fixupFrontMatter does. Only the content-before-heading gate is shared: findTitle
still scans every cell (fixupFrontMatter inspects only the first markdown cell)
and cannot filter on cell_type, so the two are not equivalent. State that the
gate is borrowed and list the divergences, so the comparison is not read as a
guarantee of identical behavior.
cderv commented Jun 30, 2026

Member Author

I am putting this in draft because option B might be the better choice...

I need to think about it.

Development

Successfully merging this pull request may close these issues.

Python comments wrongly rendered as title in the sidebar of embeded notebooks