Microformats Wiki (microformats.org)
132 points by Tomte on Aug 29, 2022 | 67 comments



PSA: the best way to enable Microformat-like stuff with almost no effort is to keep describing content with reasonable class names even if you never pay any special attention to the microformats.org standards.

One unfortunate side effect of things like Tailwind CSS is that some folks with a single-minded focus on looks are now using only the Tailwind class names, but you can do both. If before you'd have written something like class="video-thumbnail" and styled the content with a class selector in your CSS, but now you've adopted Tailwind and have excised all the descriptive class names from the content (rather than just deleting your stylesheet and adding Tailwind classes to whatever class names are already there), then please put the descriptive class names back in! And if you have colleagues submitting patches that remove class names because you're migrating to Tailwind, tell them to use Tailwind classes in a way that adds to the classification instead of as a replacement for traditional, descriptive class names.

(A similar thing happens with CSS compilers. The autogenerated stuff served up on podcasts.google.com is nasty, for example.)


Is there any evidence this matters at all?


Aside from, uh, being affected by it? I don't know what "matters" means to you.


Matters how? Microdata seems irrelevant to begin with, ignored by search engines or anything else (at least in my limited tests), and faking it with semi-semantic class names alongside auto generated ones will do what, exactly?

Like, what is this additional data layer actually for? What humans or software use it?


> Matters how?

I don't know. That's your word. You're the one who wrote it. I'm asking you.

> faking it with semi-semantic class names

Huh?


I mean why go through all this effort at all?


You're going to have to put more effort into communicating that you've thought about what you're asking (including effort in laying out what you're even asking) if you expect any effort to be put into a response. Of your three comments here, two out of three are (ambiguous) one-liners, and three out of three come off like you took all of 10 seconds to bang them out before hitting the "reply" button.

Even if that happens, I'm past 50/50 on the question of "is this even worthy of further engagement?" at this point. (It's pretty much at a 100% "no".)


It's fine if you don't want to keep responding... but in case you want clarity, what I was asking is "Why should anyone spend any time at all adding microdata, or failing that, using CSS class names to emulate microdata?"

Your original comment read to me like, "Here's how you can do something similar using CSS class names without using the microformats standards." My question was, why do either?

With microformats, at least it's supposed to be a machine-parseable standard for crawlers and the like. It never got much traction, and of the times I've used it, absolutely no measurable benefits (in terms of SEO or indexability) were seen at all.

With descriptive CSS class names you come up with yourself, not adhering to any standard, it seems even less likely that any crawler would be able to use them meaningfully. It's even less "semantic" than microdata or semantic HTML5 tags.

So my question was, what is the benefit of doing that? Are these class names just reminders for yourself, months later, so you can remember what a certain component was supposed to do? Or does it help your development/debugging/testing workflow somehow (such as being able to easily target widgets in the DOM for automated testing, perhaps?)

I wasn't trying to dismiss your comment, just trying to understand the value of putting in that effort. I'm sorry if it came across as flippant, there's just a lot that gets lost in online posts. Anyway, feel free to elaborate if you'd like, but if you just want to move on, that's totally fine too :)


> One unfortunate side effect of things like Tailwind CSS is that some folks with a single-minded focus on looks are now using only the Tailwind class names, but you can do both. If before you'd have written something like class="video-thumbnail" and styled the content with a class selector in your CSS, but now you've adopted Tailwind and have excised all the descriptive class name from the content

Are you really suggesting that people should make Cascading Style Sheets do more than styling? We have literal entire specs for using correct attributes for accessibility and microformats. WAI, ARIA, and so on.

Furthermore, from the authors of Tailwind, Headless UI (https://headlessui.com/) is a collection of fully unstyled logic components with their primary concern being accessibility (done correctly, with aforementioned attributes, not random class names). At no point have users or maintainers of Tailwind tried to steer people away from including accessibility or microdata in their designs.

This feels like an incredibly thinly veiled attack on Tailwind by the ever vocal anti-Tailwind interest group for something it shouldn't be doing anyway.


> Are you really suggesting that people should make Cascading Style Sheets do more than styling?

No. I'm talking about the class attribute, not style sheets. The single-minded Web developers with a focus only on styling that I mentioned seem to have formed the belief that class names are just "a CSS thing" (or something), and that's the problem.

> At no point have users or maintainers of Tailwind tried to steer people away from including accessibility or microdata in their designs.

K. Who're you arguing with here?

> This feels like an incredibly thinly veiled attack on Tailwind by the ever vocal anti-Tailwind interest group

Consider that leaping to conclusions like this indicates that you might be living in a bubble. (See "single-minded focus".) I'm not part of some rival frontend gang. I don't even exist in the milieu—the world is way bigger than visual designers doing Web work who spend their discretionary time talking about the trade.


> The single-minded Web developers with a focus only on styling

Again with the attacks.

There is no evidence that eg Google uses arbitrary class names in order to index metadata over the accepted standards of microdata.

https://developers.google.com/search/docs/advanced/structure...

Given I know what microdata is, when to use it, and how it's different to CSS, I'm not sure what bubble this would be. Perhaps the one that is familiar with Google crawling?


Where do you see an attack there...? That quote is a completely anodyne, factual description. There is zero pejorative (or even emotional) content in it, for that matter.

> There is no evidence that eg Google uses arbitrary class names in order to[...]

Again: who are you arguing with? Every response you've written here has the quality of being written as if addressing someone or something that doesn't even appear in the discussion.

> I'm not sure what bubble this would be

The one that goes (roughly), "clearly I am addressing a fellow tradesperson who is biased against other factions, and therefore it is reasonable to declare this to be a thinly veiled attack that I have assumed it is". If you spend your time squabbling with colleagues (and/or trying to deflect attacks on Twitter[1]), and when you come across someone you immediately leap to the conclusion that they're working against you[1] on professional–ideological grounds, then there might be some bubble-backed paranoia at work.

I don't care about your opinionated in-trade squabbles. I don't occupy either "side" of the debate that you've tried to insert me into* any more than your aunts and uncles and grandparents do. (What are their thoughts on going with Tailwind versus not? I suppose they'll let you know when they get their first client for frontend work and it comes up—or even could come up—right? Me too. Until then...)

* and this is all not even to mention—as the other person who responded to you pointed out—that the position you attributed to me is completely opposite of what I actually wrote

1. <https://news.ycombinator.com/item?id=22332525>


You opened by saying that "folks" are using Tailwind CSS instead of your preferred arbitrary class names that would mean nothing to browsers. Tailwind is a styling tool. There is nothing stopping you from using additional random class names you believe might make a difference to search engines - there is also nothing encouraging you to do that either other than a footnote in the HTML spec.

I don't know why you don't remember you opened with that, but, after seeing attacks like "single minded", "bubbles", "professional ideological grounds", merely for telling you that in the real world people are not using CSS class names in the way you want them to be and are instead using microdata, I won't be engaging in this pointless debate further.

https://developers.google.com/search/docs/advanced/structure...


> There is nothing stopping you from using additional random class names

Congratulations on arriving at the realization that matches exactly what I wrote out in my first message. That you can include both was my entire thesis, dude!

That you missed that and feel that "single-minded"[1] is an attack[2] explains the conflict here. You're reading things out of comments that aren't actually written there[3] and failing to read out of them what actually is.

I'm glad that this exchange is over.

1. <https://en.wiktionary.org/wiki/single-minded>: Intensely focused and concentrated on purpose, thinking of only one goal, undistractable.

2. Are you perhaps mistaking it with "small-minded"?

3. That stuff, along with being able to "make a difference to search engines" mentioned, once again. Why? To reiterate—in response to what, exactly? Who are you arguing with?


> Are you really suggesting that people should make Cascading Style Sheets do more than styling?

The parent made no such suggestion. You seem to be conflating the purpose of HTML classes with CSS. Here's a note from the "literal entire spec"[0], which other commenters have already pointed out:

> […] authors are encouraged to use values that describe the nature of the content, rather than values that describe the desired presentation of the content.

Furthermore, the parent took great pains not to attack Tailwind, instead pointing out that the two different approaches could easily coexist.

If you'd like to use class names solely as CSS hooks, go right ahead, but citing "correct attributes" is ironic at best.

[0]: https://html.spec.whatwg.org/multipage/dom.html#global-attri...


There is no evidence that eg Google uses arbitrary class names in order to index metadata over the accepted standards of microdata.

https://developers.google.com/search/docs/advanced/structure...


I'm not Google but when I write a scraper I frequently use class selectors to pick out data. It's a rare occasion when all the information I need from a page is tagged as microdata.


You could probably just chuck those class names in the declaration anyway, even if they weren't attached to any styles!

Of course, having well-considered, structured semantic markup works very well!

For instance, with images, you can use attached caption markup to describe an image's purpose and reference it in the written body of the document.


Or you can spit out the important stuff from the state (or model) as JSON and chuck that in a div?


This is the first time I’ve heard of css classes used as a substitute for micro format data, do you have a reference for that? Also, from my understanding microformat really isn’t being used much anymore and the preferred method nowadays is json-ld.


> css classes

There's no such thing. There are already detailed comments in this thread about this mixup.


I'd argue that using the `class` attribute for describing the content of a field is overloading it, and that it should only be used for linking to CSS styles.

You're better off using another attribute, e.g. some data-* attribute, for semantic information.


The HTML Standard disagrees:

> […] authors are encouraged to use values that describe the nature of the content, rather than values that describe the desired presentation of the content.

[*]: https://html.spec.whatwg.org/multipage/dom.html#global-attri...


What possible benefit can come from doing this, except to appease the standard writers? Describing the presentation of content is the sole purpose of CSS. I don't see why it should matter to anyone except the author herself what names are chosen for the presentation rules.


> Describing the presentation of content is the sole purpose of CSS.

But describing the presentation of content isn't the sole purpose of the `class` attribute. Having meaningful domain-related names included in the class makes UI testing a lot easier.


As I said in my parent comment, there are other attributes better suited to purposes like this than overloading the `class` attribute.

You can, for example, use a `data-testid` attribute if the element can't be identified reliably with another attribute, such as the `id` or `role`.


A person might get the impression based on your comments that classes are somehow "for" CSS (they're not) and that using them for some non-CSS use is somehow "wrong" (it's not). Meanwhile almost everything you're saying here is opposite what the HTML spec says.

As I said in my (earlier) response to you: using the "class" attribute for semantic, non-styling purpose is not "overloading" or misusing it. It is simply using it. (Just like the folks who developed CSS were able to get use out of element classes by baking tier-one support for them into their selector language. CSS has no claim to the right and proper use of an attribute that predates CSS.) To argue against this is revisionism.

The HTML5 data attributes aren't a replacement for proper use of the class attribute, either. What the class attribute encodes is (surprise) the class that the content belongs to. The data attribute payload encodes a particular value associated with a given key for a particular instance (e.g. the testid payload from your own example). And aside from the use cases being incomparable, the HTML spec says they shouldn't be used the way you're saying.


Accessibility for disabled users count?


User agents usually don't have general artificial intelligence and often can't infer content intent from class names. Accessibility doesn't magically improve just because of classes.

Accessibility help relies on using the correct, semantic tags, and when that is not possible, WAI-ARIA attribute hints.


No, you should be using ARIA attributes, such as `role`, for this where it cannot be inferred from the element's tag name.


That ship has already sailed if you're using Tailwind though.


No it hasn't. As I said in my first comment, you can stuff your Tailwind selectors into the class list _and_ still put the content class annotations in the class attribute.

It's as easy as:

  <img class="video-thumbnail w-24 h-24 rounded shadow"[...]

... instead of swapping out the "video-thumbnail" class name with the Tailwind stuff (replacing it). They are not mutually exclusive.
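
To make the point concrete, here is a minimal sketch of why the combined class list still works for a consumer that only cares about the descriptive name. The helper names `classTokens` and `hasClass` are made up for this illustration; the only thing taken from the HTML Standard is that a class attribute is a set of tokens separated by ASCII whitespace.

```javascript
// Split a class attribute value into its tokens.
// HTML separates class tokens on ASCII whitespace (tab, LF, FF, CR, space).
function classTokens(classAttr) {
  return classAttr.split(/[\t\n\f\r ]+/).filter((token) => token.length > 0);
}

// True if the attribute value contains `name` as a whole token
// (not merely as a substring of another class name).
function hasClass(classAttr, name) {
  return classTokens(classAttr).includes(name);
}

// Descriptive class plus Tailwind utilities: both kinds of consumer are served.
const combined = "video-thumbnail w-24 h-24 rounded shadow";
// Utilities only: the styling still works, but the description is gone.
const utilityOnly = "w-24 h-24 rounded shadow";

console.log(hasClass(combined, "video-thumbnail"));    // true
console.log(hasClass(utilityOnly, "video-thumbnail")); // false
```

In a browser, `element.classList.contains("video-thumbnail")` or the selector `.video-thumbnail` performs the same whole-token match.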


Solution: don't use Tailwind.


> I'd argue that using the `class` attribute for describing the content of a field is overloading it

You, like several other commenters, seem to have an internalized belief (probably based on assumption/association?)* that content class names are a contribution of the CSS spec. (This is the only way to interpret your "overloading" remark.)

Just because you can use the classes in your CSS as selectors doesn't mean that classes are just a CSS "thing" (or whatever) and that doing something else with them is "overloading" them. That's silly. The class attribute is for, like, you know, specifying the class of the content.

* (This raises the question: Is the word "class" just an opaque token to people? They've never typed out the sequence 'c', 'l', 'a', 's', 's', '=' and pondered it at all? Or do they, but it's limited to something akin to, "If you want to put a border or something then you put 'class' here on the element and then you can CSS it. I dunno why they called it that, but that's what you write, and it lets you do the border later." What do they think of the fact that you can also create CSS selectors based on element ID? Are element IDs, then, another just-a-CSS-thing in their minds? Or what about that you can make selectors using the element name itself? Or that class selectors are not the default—if you want to use a class-based selector in a style rule, you actually have to escape from the default mode by using a leading dot to specify that what follows should be interpreted as a class name? If classes were strictly within the purview of CSS, shouldn't the class selector be the default...?)


It's worth noting Microformats was introduced 17 years ago, and was last updated about 12 years ago. It never really picked up traction, largely because players like Google have their own knowledge graph, and similar structured formats like RSS have lost their popularity since then.


That's not exactly accurate. The wiki has been edited recently, and change discussions are now done through GitHub issues before updating the wiki, as that is more convenient than inline chat in many cases (https://github.com/microformats). Also, a lot of the practical discussion of microformats use is at the IndieWeb wiki; see https://indieweb.org/posts#Types_of_Posts for h-entry, for example.


SEO was always a good argument to adopt some of the microformats. Search engines can make use of that to pick apart content. It's just that we have a few more tools than just microformats for that now; including some structural html 5 tags.

What died out is attempts to do stuff with microformats via browser extensions. This was once a thing that e.g. MS did when they launched Edge. Those extensions have largely disappeared or just never really caught on.


I used to wonder "What would Social Networking look like if it were based on self-hosted pages written with microformats like h-card[0]?", but presumably the answer is "a privacy nightmare" (even more so than existing social networks).

Perhaps a better answer to that question is "the Fediverse", which is based on ActivityPub. Unfortunately, though, there still seem to be open questions about how well/widely the Fediverse supports something like "circles" (the concept that Google Plus had) for selectively sharing content with different groups of people [1].

[0] http://microformats.org/wiki/h-card

[1] https://rusingh.com/fediverse-google-plus-circles/


Haven[1] is my open-source, self-hosted side project for building exactly this (social networking based on self-hosted pages) in a privacy-first manner. There's a lot from the IndieWeb community that I want to emulate, and microformats is one of them--partly in order to create compatibility with the various MF2/IndieAuth client (phone) apps.

My experience/opinion is that a circles-based approach is a usability nightmare since relationships are fluid and fuzzy, so Haven only has one "share group"--everyone with access to your Haven. That said, it's all built on open protocols (RSS), so it would be pretty straightforward for someone to fork it or build their own that implemented a circles-style approach to sharing.

[1]: https://havenweb.org


The specific idea, way back then, was DiSo [1]. Believe it or not, privacy wasn't the concern back then. The real problem was UX. Setting up and maintaining this stuff was Hard, and the network effect mattered a lot more in getting adoption. Facebook was really good at onboarding flows. OpenID lost to Facebook Connect for similar reasons.

1: https://diso-project.org/


UX definitely ended up being one of the hardest parts. We also never got to the point of really building a compelling product on top of the protocols. We did actually do some work on privacy though, but it never got super far. For example, you could set simple access controls on content that was only available to friends that logged in with an OpenID. I think I had it so that certain of your hcard attributes were only included for authenticated users as well.

Some of the work we did then still exists today in various forms. Activity Streams led to ActivityPub, which has seen far more adoption than we ever did.

And microformats are still widely used in some communities, though certainly not like it could have been, largely thanks to Google putting weight behind schema.org. For example, nearly all of the indieweb.org work is based around microformats at the core, with things like webmention, micropub, etc.


The man himself! Thanks for chiming in.


Yes, this is something where Facebook excels, but it can also create in-group and out-group dynamics. A possible federated solution: https://beesbuzz.biz/blog/6128-Federated-access-control-with...


I don't get it. The big problem with the old meta tags is that nobody follows standards[1], and they were frequently abused to misrepresent the document in a favorable way (especially keywords and description).

How would this help alleviate those problems?

[1] To this day. I've learned, from building a parser for my search engine, you can't just select the <title> tag if you want the title of a page. You need to select the title tag in the <head>-tag. Otherwise you'll get the title-tags people semi-regularly use in the body as well... Like not just hobbyists. Found a major American university that used title-tags to wrap navigational links.


I keep finding these when I least expect it and they've been fantastic every time - it usually means whatever I'm trying to automate is 90% solved for me.

Scraping recipes from the bbc, flight reservations from my email, etc. If something includes schema.org/RsvpAction it probably needs to be actioned, if something contains schema.org/DiscountOffer it can probably go in the Noise folder.

If someone starts sending spam containing flight reservation meta I'm going to be so sorely disappointed in the human race.
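
A rough sketch of that kind of triage, assuming the structured data arrives as a JSON-LD string (for example, already pulled out of a `<script type="application/ld+json">` element). The function names, the "Needs action" and "Inbox" folder names, and the sample payloads are all invented for illustration; only the `RsvpAction` and `DiscountOffer` type names and the "Noise" folder come from the comment above.

```javascript
// Recursively gather every "@type" value found anywhere in a JSON-LD structure.
// Full IRIs such as "http://schema.org/RsvpAction" are reduced to "RsvpAction".
function collectTypes(node, found = new Set()) {
  if (Array.isArray(node)) {
    for (const item of node) collectTypes(item, found);
  } else if (node !== null && typeof node === "object") {
    const type = node["@type"];
    for (const t of Array.isArray(type) ? type : [type]) {
      if (typeof t === "string") {
        found.add(t.replace(/^https?:\/\/schema\.org\//, ""));
      }
    }
    for (const value of Object.values(node)) collectTypes(value, found);
  }
  return found;
}

// Pick a folder for a message based on the schema.org types it declares.
function triage(jsonLdText) {
  let data;
  try {
    data = JSON.parse(jsonLdText);
  } catch {
    return "Inbox"; // not valid JSON: make no assumptions about the message
  }
  const types = collectTypes(data);
  if (types.has("RsvpAction")) return "Needs action";
  if (types.has("DiscountOffer")) return "Noise";
  return "Inbox";
}

// Hypothetical payloads, for illustration only.
const invitation = JSON.stringify({
  "@context": "http://schema.org",
  "@type": "Event",
  name: "Team dinner",
  potentialAction: { "@type": "RsvpAction" },
});
const promotion = JSON.stringify({
  "@context": "http://schema.org",
  "@type": "DiscountOffer",
});

console.log(triage(invitation)); // "Needs action"
console.log(triage(promotion));  // "Noise"
```

As the spam worry above suggests, anything a sender can declare a sender can also fake, so a real filter would want to weigh the sender as well as the markup.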


I've gotten spam calendar appointments that my email client has helpfully added to my calendar so that I'm helpfully reminded of their spam sometime in the future with a push notification.


Good grief that has to be confusing. I'm often temporarily baffled by legit entries when notified, let alone strange or unexpected spam entries :)


Plus <title> tag is valid in SVG context, where it does what `title` attribute does on HTML elements (provides content for HTML tooltips and screen readers). And since inline SVG is valid in HTML, you can get valid title tags (from SVG namespace) outside HEAD element

Plus if there wasn't TITLE in the head and is encountered inside BODY (not valid), it is adopted into the HEAD from document object model's perspective as if it was hoisted there (but it is not, just its value):

    data:text/html;charset=utf-8,<head>
    <!-- <title>not here</title> -->
    </head>
    <body bgcolor=dimgray>
    <svg viewbox="0 -15 80 20" fill=cyan>
     <title>SVG Tooltip</title>
     <text>Some SVG</text>
    </svg>
    <title>Title For Adoption.</title>
    <title>Second late title.</title>
    <p>And the title is: »<output id=o></output>«.
    <body onload="o.value = document.title">
    <body text=snow>

Resulting paragraph reads: "And the title is: »Title For Adoption.«."

("Yet unseen attributes" of the consecutive BODY tags are physically "hoisted" to real body node, but it's a different chapter.)

Plus both <head> / </head> tags are optional (and implied), so "selecting title in head tag" can become a very hard task, if taken seriously.

Making HTML parser ain't easy.


You complicate the description and make it sound much harder than it actually is.

Concerning HTML parsing in general: it is easy, genuinely easy, one of the easier things to implement, because it’s well-defined, and in a format that matches the implementation. You’re basically just translating the algorithm from pseudocode into code. Sure, it’s long, but it’s not hard.

<head> and </head> being optional is purely a parser concern, that the start and end tags are optional. Implement the parser, and you get that behaviour automatically, and nothing beyond that needs to worry about it at all.

For that matter, you don’t need to worry about head or body in determining the title, because here’s how the document title is actually determined, per https://html.spec.whatwg.org/multipage/dom.html#the-title-el...:

> The title element of a document is the first title element in the document (in tree order), if there is one, or null otherwise.

(And then it goes on to describe further processing done for the document.title attribute.)

The only subtlety in this explanation is that when it speaks of title elements, it’s speaking of HTML title elements only. There’s nothing complicated or difficult about this. There’s no adoption, it’s just taking the first HTML title anywhere in the document.

(If implementing this in browser JavaScript, you can’t just use document.querySelector("title") because it ignores namespaces. The most efficient way will be to use document.evaluate() with an XPath like "//title/text()" which matches the required child text content nodes.)
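
For anyone working on an already-parsed tree outside a browser, the rule quoted above can be sketched as a plain pre-order walk. The node shape used here (`namespace`, `localName`, `children`, `text`) and the `el` helper are assumptions made up for this example, not the API of any particular parser library; the sample tree mirrors the data: URL example earlier in the thread.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";

// Tiny constructor for the hypothetical element nodes used in this sketch.
function el(namespace, localName, children = [], text = "") {
  return { namespace, localName, children, text };
}

// Return the first HTML-namespace <title> in tree order (pre-order,
// depth-first), or null if there is none. SVG <title> elements are skipped
// because their namespace differs, even though the local name matches.
function findTitleElement(node) {
  if (node.namespace === HTML_NS && node.localName === "title") return node;
  for (const child of node.children || []) {
    const found = findTitleElement(child);
    if (found !== null) return found;
  }
  return null;
}

// Empty head, an SVG <title> in the body, then two HTML <title> elements.
const doc = el(HTML_NS, "html", [
  el(HTML_NS, "head"),
  el(HTML_NS, "body", [
    el(SVG_NS, "svg", [el(SVG_NS, "title", [], "SVG Tooltip")]),
    el(HTML_NS, "title", [], "Title For Adoption."),
    el(HTML_NS, "title", [], "Second late title."),
  ]),
]);

console.log(findTitleElement(doc).text); // "Title For Adoption."
```

This covers documents whose root is an HTML element; the spec treats a document whose root is an SVG `svg` element separately.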


I'm not sure if I follow your line of thoughts correctly. I agree that yes, HTML parsing is well defined in current standard, but (subjectively, for me) it is far from "easy", given the amount of states and back-compat "burden" with plenty of often overlooked exceptions. (I like to explore them but I'd never try to make a compliant HTML parser from scratch, probably.)

I was in fact responding to

> […] I've learned, from building a [HTML] parser […] you need to select the title tag in the <head>-tag.

which gave me the impression that OP really rolls his own HTML parser and relies on some possibly dangerous assumptions that are not in fact guaranteed (however well defined) by the specs. For me, one such assumption is especially "HTML parsing is easy" / "I understand HTML well enough to parse it myself". I wanted to point out possible further gotchas wrt "selecting the <head>-tag" (e.g. that it may be hard when there are no <head> tags) or that "selecting the first title tag" could possibly not give the right one.

But maybe I'm just a little slow and all those "Idiosyncrasies of the HTML parser" [1] are well known. And sure, many scenarios could be waved away as unimportant edge cases, maybe.

---

As for querying the document, i.e. having the state when the document is ready and "something" did the parsing and tree heavy lifting for us (so we have DOM and JS) we can surely reach for ancient namespaced

    document.getElementsByTagNameNS("http://www.w3.org/1999/xhtml", "title")[0]?.textContent?.trim()

(hopefully there are no further details wrt comments, white-space normalization and entity expansion on top of raw textContent we should take care of) but it makes very little sense when there is `document.title` for this exact purpose.

---

[1] https://htmlparser.info/parser/


To implement an HTML parser, you don’t need to worry about the corner cases at all, because the spec has your back and spells out exactly how every single case should be handled, in the form of state machines, which is how you will implement it. There are involved details, to be sure, but it’s genuinely not hard to follow the spec. The idiosyncrasies document you cite is for authors of HTML: implementers genuinely don’t need to worry about them, because it’s all covered by the spec.

For document.title: naturally in a browser you would use that; I intended to describe just how you would achieve it without that. And I completely forgot about document.getElementsByTagNameNS for some reason, which is of course more sensible than a querySelector + find. Note that .textContent.trim() doesn’t match the algorithm which is spelled out in the spec (just below the earlier link), on two counts. Firstly, all sequences of ASCII whitespace in the middle of the string need to be collapsed to a single space ("\r\n\f\t hello\r\n\f\t world\r\n\f\t " → "hello world"). Secondly, .textContent is insufficient, including the text content of element children as well, whereas the spec says child text content (with a link to the exact definition); HTML syntax can’t produce such elements (the parser switches into RCDATA state), but XML syntax can, as can DOM manipulation by scripting. Examples that are both titled “included” rather than the textContent “inclexcludeduded”:

  data:application/xhtml+xml,<title xmlns="http://www.w3.org/1999/xhtml">incl<b>excluded</b>uded</title>
  data:text/html,<title>incl</title><script>document.querySelector("title").append(b=document.createElement("b"),"uded"),b.append("excluded")</script>
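
The two differences from `.textContent.trim()` can be sketched in plain JavaScript. The function names and the `{ type, data, children }` node shape are hypothetical, chosen only for this illustration; the whitespace set (tab, LF, FF, CR, space) is the ASCII whitespace the spec refers to.

```javascript
// Strip leading/trailing ASCII whitespace and collapse inner runs to one space.
// Note that String.prototype.trim() also removes non-ASCII whitespace such as
// U+00A0, which this deliberately leaves alone.
function stripAndCollapseAsciiWhitespace(text) {
  return text.replace(/[\t\n\f\r ]+/g, " ").replace(/^ | $/g, "");
}

// Concatenate only the direct text-node children, ignoring element children
// (and therefore any text nested inside them).
function childTextContent(children) {
  return children
    .filter((child) => child.type === "text")
    .map((child) => child.data)
    .join("");
}

// The title string for a <title> element given its child nodes.
function titleText(children) {
  return stripAndCollapseAsciiWhitespace(childTextContent(children));
}

// Mirrors the two examples above: <title>incl<b>excluded</b>uded</title>
const titleChildren = [
  { type: "text", data: "incl" },
  { type: "element", children: [{ type: "text", data: "excluded" }] },
  { type: "text", data: "uded" },
];

console.log(titleText(titleChildren)); // "included"
console.log(
  stripAndCollapseAsciiWhitespace("\r\n\f\t hello\r\n\f\t world\r\n\f\t ")
); // "hello world"
```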


I think these things are very related, but Microformats isn't going to solve this problem on its own. After all, it's been around for quite a while, just like most of these conventions.

10-15 years ago, we had a bunch of different ways everyone was trying to schematize their data on the web, and since we never all agreed to just use one of them, everyone now uses fragments of all of them — and of other pseudo-proprietary systems — in their own unique ways.


Indeed - every silo wants you to use their own specific markup, but even then they choose what to show. I made a post a while back that gave a different summary on every silo platform https://www.kevinmarks.com/partialsilos.html Of course, since then some of the silos have died or given up on their own systems, but microformats remains useful if you want to co-operate.


Related:

Microformats: Still Relevant? - https://news.ycombinator.com/item?id=32285207 - July 2022 (1 comment)

Google confirms microformats are still a supported metadata format for content - https://news.ycombinator.com/item?id=22521666 - March 2020 (40 comments)

Ask HN: Is it worth it to implement HTML5 microformats? - https://news.ycombinator.com/item?id=14515178 - June 2017 (1 comment)

Microformats are easy to learn, and pay off well in SEO and mobile - https://news.ycombinator.com/item?id=4328853 - Aug 2012 (4 comments)

Ask HN: Micro-formats, are they still relevant/useful? - https://news.ycombinator.com/item?id=2702516 - June 2011 (2 comments)

Ask HN: Microformats - Still useful? - https://news.ycombinator.com/item?id=1747657 - Oct 2010 (5 comments)

Microformats.org at 5: Two Billion Pages With hCards - https://news.ycombinator.com/item?id=1583784 - Aug 2010 (1 comment)

Ask HN: Micro formats? Or, how to make my site's google link look good? - https://news.ycombinator.com/item?id=1224242 - March 2010 (3 comments)

Microformats: Boon or Bane? - https://news.ycombinator.com/item?id=987688 - Dec 2009 (5 comments)

Rest/ahah · Microformats Wiki - https://news.ycombinator.com/item?id=808251 - Sept 2009 (1 comment)

Google Announces Support for Microformats and RDFa - https://news.ycombinator.com/item?id=606126 - May 2009 (8 comments)

If the next "version" of the web is all about semantics, why aren't more people using microformats? - https://news.ycombinator.com/item?id=171818 - April 2008 (34 comments)

Consolidate and take back your social network with XFN, openID and microformats - https://news.ycombinator.com/item?id=29512 - June 2007 (5 comments)


Both Mastodon and Tumblr have microformats support built in. Twitter and Facebook used to, but they decided they'd rather be silos than protocols


Can you expand on this? I'd be more impressed by microformats if the microformats website could parse a DTD or schema and launch a structured editor for it.


Microformats are basically guidance for scrapers.


Google prefers JSON-LD though[0]. Also I'm not keen on defining content with styling classes.

[0] https://www.searchenginejournal.com/google-structured-data-p...


The class attribute isn't just for styling though.

See https://html.spec.whatwg.org/multipage/dom.html#global-attri...

> authors are encouraged to use values that describe the nature of the content, rather than values that describe the desired presentation of the content.


> values that describe the nature of the content

vs.

> rather than values that describe the desired presentation of the content.

Or could it be that the meaning here is that you should write descriptive classes such as 'alert-box' instead of 'red-box'?


But that gets your "not just for styling" property for free — now another component can use the knowledge that the element is an alert box.


TIL! So those obnoxious JavaScript frameworks using classes for declaring logic are actually valid???!!


As mentioned in https://www.jvt.me/posts/2020/03/02/google-microformats-supp... Microformats are still a recommended route


Not recommended but supported. JSON-LD is recommended. See https://developers.google.com/search/docs/advanced/structure...



http://webdatacommons.org/structureddata/#toc3

The Web Data Commons project extracts all Microdata, JSON-LD, RDFa, and Microformats data from the Common Crawl web corpus


The data that you mark up might be micro but the markup isn’t.



