My second review starts with the following words:
A gorgeous dry white that was so aromatic and balanced that I thought it was an excellent example of Gewürztraminer from Alsace, one of my favorite wine regions.
Was I right? Click here to find out.

Dear Josh,
I really like the thoroughness of your Uber review system–I have lots of thoughts, but I’ll just share one here. I wonder if the best review would be one where the reviewer was at least aware of the varietal, rather than double blind, especially since evaluations of wines in the Uber system are indexed to varietals.
Great work!
Since scores aren’t assigned until after the wine is revealed, I don’t see how knowing anything about the wine beforehand will improve the review. It can only be a detriment, since then preconceived notions of what the wine “should” be creep into the reviewer’s mind. In wine there are just too many senses that come into play for us to be able to keep up, so we put them in boxes out of basic necessity. But that necessity can also blind us to qualities in a wine, both positive and negative. Understanding what variety you are dealing with before assigning a score is important – but knowing what the variety is before tasting it isn’t.
Thanks for the insightful comment and the kind words!
Josh,
Interesting note/methodology. I wonder about a few things:
1) Given that scoring seems to be based at least in part on “typicity” and/or varietal character (however those are defined/determined) how can such a high rating be applied to a wine that can be mistaken by a critic as coming from another variety?
2) One of the stated goals of the system is to counter score inflation. But I’ve yet to see a major publication rate a Torrontés this highly (or a rosé 100 points), so I am unclear on whether your system achieves this goal or instead results in even more high scores. We need more data points.
3) Which brings me to my third question: scalability. Do you think it is feasible to publish 10,000 or more such reviews in a year? In an era when a premium seems to be placed on writing shorter and shorter items, do you think there are enough interested readers to support such an endeavor?
Finally, thanks for your efforts. I am always interested in ways wine criticism may evolve and the various philosophies behind reviewing wines.
Great questions. I’ll do my best.
Let me begin though by saying that I’m not holding up my palate as the standard. My scoring reflects my personal preferences, and at this point (as you rightly point out) you don’t have enough information to calibrate your preferences with mine. I’m more interested in the system.
1. This seems perilously close to asking the dreaded age-old question “what is quality?” The answer is, of course, whatever I say it is (yay!). But to address your specific example: people mistake all sorts of wines all across the quality spectrum for other varieties, regions and vintages all the time when tasting blind. It’s the nature of the beast, and it says more about the limits and consistency of our sensory abilities than anything about a wine scoring system. The best we can do is remove potential bias from the process.
2. Interesting. Where you see score inflation, I see the breaking of a glass ceiling. But yours is certainly a valid point. However the fact that no one has ever rated a rosé 100 points simply argues for my point: are you saying no one has ever made a 100 point rosé, when rated based solely on its own merits?
I’ve had one that’s at least 98: Schlossgut Diel. But no one will ever give it higher than a 90 because they have to leave headroom for the more serious Bordeaux, Burgundies, Hermitages, Brunellos, etc. Eliminate the need to reserve that headroom and you should see scores float up. And since we aren’t comparing apples and oranges anymore, they should be more “accurate”.
3. No, and a qualified yes. As I said when I first proposed the system, it is too time consuming, too expensive, too much potential for embarrassment on the part of the critic, not to mention 100 other reasons why it is impractical.
But that doesn’t mean it isn’t the best system. And when properly marketed to, consumers usually respond to quality. Were producers forced to have their labs certified under more rigorous standards than currently apply, and provide that info to reviewers, much of the added expense would evaporate.
In the end though, critics will never taste double blind and expose their reputations in the way I propose. It’s just too darn scary.
Thanks for the comments and the feedback!
Josh,
I am not so sure that not knowing the variety upfront “removes potential bias from the system.” I think it’s likely that the fact you mistook it for a gewürztraminer from Alsace (one of your favorite wine regions) affected your rating, so while it may eliminate one type of bias, it potentially introduces another–this time based on a guess, not a given.
I’m also confused by the process of tasting the wine double blind and scoring it open label. Doesn’t this reintroduce the same label biases into scoring (preconceived notions about region/variety/producer) you were striving to eliminate? It may lead to a tasting note that more accurately reflects what is in the glass, but how does it affect the rating?
Apples and Oranges. By rating each type of wine within its own idiom (or perceived idiom), it seems to me the system sets up a myriad of 100-point scales that cannot be directly compared, potentially leading to confusion in the market. The acceptance of the 100-point scoring system stems from its simplicity and intuitiveness; this system requires consumers to apply a filter (“wine type”) over the score to be able to properly understand it.
Scalability may not be important to hobbyist-bloggers, but for a commercial enterprise and widespread acceptance, it’s a necessity. While the lab costs–perhaps you could share what those would be for each wine–are part of the issue, so are labor costs. At $50 or more per hour for a reviewer’s time, this becomes a huge barrier unless there is an affluent audience prepared to pay a substantial premium for this content. I’m not convinced that’s the case.
Although I can’t speak for other reviewers, tasting double-blind wouldn’t scare me. Heck, folks spring wines on me all the time in small settings and I get them wrong regularly without dying of shame–my latest (Friday night) was calling an Adelaide Hills Sauvignon Blanc a Marlborough. Where it becomes interesting is when critics are asked to rate wines blind (especially those that they have previously scored). That (consistency) is probably a whole other subject, and one I think would be very interesting to study.
Joe,
Thanks for the follow up! Long inline response coming…
“I think it’s likely that the fact you mistook it for a gewürztraminer from Alsace (one of your favorite wine regions) affected your rating, so while it may eliminate one type of bias, it potentially introduces another–this time based on a guess, not a given.”
Not sure I follow your logic there. I didn’t like the Torrontés because it tasted like “Gewürz from Alsace”. That’s putting the cart before the horse. I like Gewürz from Alsace because it has certain characteristics: full body, intense aromatics, etc. This Torrontés had those qualities, and that they were present in the wine is what is important to me, not where it ended up being from.
“I’m also confused by the process of tasting the wine double blind and scoring it open label. Doesn’t this reintroduce the same label biases into scoring (preconceived notions about region/variety/producer) you were striving to eliminate? It may lead to a tasting note that more accurately reflects what is in the glass, but how does it affect the rating?”
The blind note is where the rubber meets the road.
It was my impression of the wine before I knew anything about it. After I write it down, the note doesn’t change. Caught up in that assessment, though not explicitly stated as a score, is the seed of the rating to come. This is important because it gives the consumer an audit trail of sorts. Did an otherwise nondescript wine suddenly get a much higher score than it seems it deserved after the reveal? Also, do the labs confirm the observations of the critic? She said there was high acid – was there? She said there was earthiness – was there Brett or was it something else?
There is no eliminating the subjective nature of wine criticism. The goal of the system isn’t to eliminate it; it’s to make the process as objective as it can be. I’ve never met a sensory scientist yet who feels that there is a better method to accomplish this than a double blind tasting. No matter what the score ends up being, it will have to be consistent with what the reviewer wrote before the reveal or they end up risking their credibility.
“Apples and Oranges. By rating each type of wine within its own idiom (or perceived idiom), it seems to me the system sets up a myriad of 100-point scales that cannot be directly compared, potentially leading to confusion in the market. The acceptance of the 100-point scoring system stems from its simplicity and intuitiveness; this system requires consumers to apply a filter (“wine type”) over the score to be able to properly understand it.”
You are correct. Each wine idiom gets its own scale. Where I disagree is that this will cause any more confusion in the market than is already there. An absolute score doesn’t meet the goal you cite. Most magazines split their reviews out by variety and region as it stands, and I don’t accept that any meaningful information can be gleaned by comparing scores across different regions or idioms under the current system. Most wine mags have different reviewers scoring different regions, and this confounds comparing scores more than whether the system is relative or absolute.
And finally most consumers only want to know if the wine is good. Folks that rely on scores exclusively care only if the wine was rated above 90, not where it stands compared to every other Burgundy on the planet.
“Scalability may not be important to hobbyist-bloggers, but for a commercial enterprise and widespread acceptance, it’s a necessity. While the lab costs–perhaps you could share what those would be for each wine–are part of the issue, so are labor costs. At $50 or more per hour for a reviewer’s time, this becomes a huge barrier unless there is an affluent audience prepared to pay a substantial premium for this content. I’m not convinced that’s the case.”
Not going to argue with you re: the business side. I readily concede the fact. Again though, that doesn’t mean this isn’t the best system available. Imagine if wine producers used this type of rationale for the quality of their products. Then there really wouldn’t be very many 100 point wines!
The average cost per bottle for labs is close to $40, but a 4-EP/4-EG test alone can run $65 and ethyl acetate and aldehyde tests are over $100. You have to pick the ones that will best contribute to the wine review.
“Where it becomes interesting is when critics are asked to rate wines blind (especially those that they have previously scored). That (consistency) is probably a whole other subject, and one I think would be very interesting to study.”
Totally agree! Parker has taken a lot of heat recently over this on Bordeaux, for instance. But it’s a human failing, not something personal to him, IMO. Which is another benefit of the system! Reviewers will be wrong in their guesses probably 70% of the time (especially with blends), so there is no false sense of infallibility associated with the review. That’s empowering for a consumer.
Thanks tons for the conversation Joe!
Josh,
Nice work here. I’m not going to go as in depth as the other commenters, because really I’m just completely geeked out over the Wikipedia in-blog pop-ups! I reference the site a lot in my pieces, and think this is an awesome tool – did you write something to implement this yourself or is it a feature of the mobile Wikipedia site? Such a great idea! Looking forward to your ongoing analytical adventures.
Thanks for the kind words Ryan!
The pop-up tech is actually courtesy of a WordPress plug-in called WikiPop. You can download it here: http://wordpress.org/extend/plugins/wikipop/
It’s a really nice way to approach all this, but in the end, I fear the shelf talker would (accurately) quote from your review: “Excellent Wine at Very Good Value….” Egads
Josh,
I’m sorry if my line of reasoning wasn’t clear. I believe that one’s guess at a wine’s identity can influence (bias) one’s perception of quality (and vice versa), just as knowing a wine’s identity can.
Since the system doesn’t call for the wine to be scored double blind (or even single blind), I guess this matters less–the reviewer assigns a score in part based on knowing the wine’s varietal makeup, region of origin and producer. In short, all those biases the system is trying to remove from the tasting process are reintroduced as part of the scoring process. Then why not rate the wines double-blind?
I agree that segmentation of the review universe among different reviewers even under the same publication banner creates difficulty in comparing ratings from one region to another. Adding yet another variable (wine style/idiom) doesn’t make it any better, IMO. The original beauty of Parker’s WA was the single reviewer, single voice and the perceived consistency it delivered.
As far as the lab tests, how does one determine which “will best contribute to the wine review?” Sure, it’s easy enough if something reeks of Band-Aid and clove to check the 4-EP/4-EG box on the lab request, but wouldn’t that also be valuable info if the reviewer doesn’t report any sensory evidence and the lab result were to come back positive? Then you’d know the reviewer has a high tolerance for or poor perception of that sort of thing (at least on that day, in that wine).
All this “talk” is making me thirsty, so if you ever get to NY we should pull some corks.
Josh,
Given your (double-) blind method where the only info available to you is wine color, it sounds like the Uber system is a system that evaluates wine based on color: red wine, white wine and rosé. For example, x is a 94 point white wine that turns out to be 100% Torrontés. I find this perfectly acceptable.
Would sparklers be in their own category (rather than the color of the sparkler) for evaluation? Also, fortified wines seem to defy classification by color.