Monday, October 17, 2016

CSS Oddities: anonymous inline whitespace nodes

I learned something today. It all started with a @Twitter post by @supersole about a new feature in @firefoxnighly that allows debugging "anonymous inline whitespace" nodes in HTML pages.
https://blog.nightly.mozilla.org/2016/10/17/devtools-now-display-white-space-text-nodes-in-the-dom-inspector/

The post claims that <img><img> on the page is rendered differently than <img>crlf whitespace crlf<img>.
I could not believe this. That is stupid, right? Which web developer would expect any difference?

Well, it seems that CSS rules - being what they currently are - lead to this unexpected difference.
The CSS spec describes the algorithm to process the HTML here in Phase I: Collapsing and Transformation.
In the second HTML fragment the whitespace between the two line breaks is deleted by step 1, which gives us
<img>crlfcrlf<img>.
Step 2 tells us to handle segment breaks ("crlf"). That is described in the Segment Break Transformation Rules.
Those rules give us <img>spacespace<img>, which is then processed further by Phase I steps 3 and 4. Step 3 does nothing in this example.

Step 4 reads:

Any space immediately following another collapsible space—even one outside the boundary of the inline containing that space, provided they are both within the same inline formatting context—is collapsed to have zero advance width. (It is invisible, but retains its soft wrap opportunity, if any.) 
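So of the remaining two spaces, only the second one is collapsed to zero width (it keeps its "soft wrap" opportunity). The first one does not follow another space, so it is left as an ordinary space between the two images, and that is the difference you see on the page.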
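
The four steps can be sketched roughly in Python. This is a simplified model under stated assumptions, not the spec's full algorithm: it only looks at the text between the two images, it turns every segment break into a space (as in the walk-through above, ignoring the spec's special cases for particular neighbouring characters), and the function name `collapse_phase1` and the 1/0 "advance" values are made up for illustration.

```python
import re

def collapse_phase1(text):
    """Return (char, advance) pairs; advance 0 stands for zero advance width."""
    # Treat CRLF, CR and LF alike as one segment break, written as "\n" here.
    text = re.sub(r"\r\n|\r", "\n", text)
    # Step 1: remove spaces and tabs directly before or after a segment break.
    text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)
    # Step 2 (simplified assumption): every segment break becomes a space.
    text = text.replace("\n", " ")
    # Step 3: every tab becomes a space.
    text = text.replace("\t", " ")
    # Step 4: a space directly following another space gets zero advance width.
    result = []
    previous_was_space = False
    for ch in text:
        is_space = ch == " "
        result.append((ch, 0 if is_space and previous_was_space else 1))
        previous_was_space = is_space
    return result

# The text between the two images in the second fragment.
print(collapse_phase1("\r\n   \r\n"))
```

In this model the first space keeps its width and the second one is reduced to zero, while the empty string between two directly adjacent images produces nothing at all.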

Good to know - maybe. Is this a feature? I expected that everything between two HTMLElements that matches (whitespace)* would be removed completely and never inserted into the rendering tree.
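
For comparison, the behaviour expected here can be written as a one-line regular expression. This is only a naive sketch of that expectation, not what browsers do: the helper name `strip_inter_element_whitespace` is hypothetical, and the pattern works on raw markup, so it would also touch whitespace inside elements such as pre where it matters.

```python
import re

def strip_inter_element_whitespace(html):
    # Remove whitespace-only runs that sit directly between two tags.
    return re.sub(r">\s+<", "><", html)

# Hypothetical file names, used only as sample input.
fragment = '<img src="a.png">\r\n   \r\n<img src="b.png">'
print(strip_inter_element_whitespace(fragment))
```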

Maybe this should be discussed here: https://github.com/w3c/csswg-drafts/issues
Not my cup of tea.

Thanks to @upsuper who pointed me to the relevant specs.

Monday, October 10, 2016

Twitter Markup

Twitter Cards have been around for some time now, and I recently wondered how commonly they are used.

There is a nice blog post on Blogger on how to integrate them there, but surely there should be a way for e.g. newspapers to promote their reports with a summary, a main image and author information that is not @Twitter specific? Microformats and schema.org to the rescue?

What does Google do? It seems that JSON-LD is the recommended format.

How would a Twitter Card look in JSON-LD?
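
One possible answer, purely as an illustration: the Python sketch below maps the usual twitter:* meta values onto a schema.org Article and prints it as JSON-LD. The function name `card_to_json_ld`, the sample values and the choice of target properties are assumptions, not an official mapping. In particular, twitter:creator is an @handle, so using it as the author's name is a simplification.

```python
import json

def card_to_json_ld(card):
    """Map a dict of twitter:* meta values to a schema.org Article (assumed mapping)."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": card["twitter:title"],
        "description": card["twitter:description"],
        "image": card["twitter:image"],
        # Simplification: the @handle is used where a real name would belong.
        "author": {"@type": "Person", "name": card["twitter:creator"]},
    }

# Hypothetical sample card, not taken from any real page.
card = {
    "twitter:card": "summary_large_image",
    "twitter:title": "Example headline",
    "twitter:description": "A short summary of the report.",
    "twitter:image": "https://example.com/lead.jpg",
    "twitter:creator": "@example_author",
}

print(json.dumps(card_to_json_ld(card), indent=2))
```

The printed object would go into a script element of type application/ld+json in the page head.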

Twitter Cards or Rich Cards or @w3c Cards?

Time to standardize!