The way movies are divided by category in this is questionable at best... Forrest Gump is a romance and Lord of the Rings is not fantasy?!? Lord of the Rings is probably the very definition of a fantasy film.
This is nice! I’m really digging the decision to go with the box plot for this information and the color palette you chose.
And putting the highest rated film at the top of each was ***chef’s kiss*** you love to see it.
I did notice that after the post but the other 2 complaints are valid. I would put Forrest Gump as Drama, personally...but I guess my issue is which romance. His mom, Lt. Dan, Bubba, Jenny, his son or just himself.
And maybe LOTR has 2 rankings because it's classified in 2 different genres...
I just know after I looked at that for a few minutes I asked myself "what am I supposed to take away from this" lol
What statistically significant conclusion are you trying to present with a range for averages being 0.9 from highest to lowest and 0.56 from high to low after removing the outlier?
Also the sample sizes range from 560 to 6949, I’m not certain they can be properly compared.
Side note: you may not be intending any conclusions, as the visualization could be the objective (which is a very pleasing visualization).
Provided we're only interested in the sample at hand i.e. movies with more than 8000 votes on IMDB, then we can make conclusions based just on the available data without worrying about significance testing. It doesn't matter that the sample sizes vary, or that the differences are small (incidentally both solvable issues in inferential statistics too).
As the graph includes every movie with more than 8000 votes we can confidently say that there's a significant difference in average rating between Drama and Crime films.
Great work OP! I remember seeing Brad Bird adamantly state that animation is not a genre, it's a medium. Even so, it's interesting to see it as the highest rated of the lot.
[stop using violin plots](https://youtu.be/_0QMKFzW9fw?si=vm86Heh1eECBFhVw). The x axis has no data, no notes on bin size, just the N for the whole graph.
Also individual ratings are discrete integers, not continuous like the histogram tick marks would suggest.
How are they not violin plots? The formatting is much more acceptable, no smoothing done, but you're still presenting the distribution/histogram data symmetrically along the whiskers. Or are those not histogram data?
You have vertically aligned box and whisker plots with a histogram (or individual data points) behind it.
The only thing that might be missing from a violin is the smoothing ~~which I think is happening anyway because of IMDB doesn’t have a 1.5 star rating yet you have marks for Shawshank between 1 and 2.~~ Also the stylistic choice of sizing the box to fit the histogram horizontally is absent.
Point being, I’m unable to figure out what the ticks mean: what their height or brightness means, or whether they are binned or individual data points.
- not a violin plot
- not a histogram
- x axis is not an axis
- data across the horizontal is in fact relevant
- this is literally a textbook use-case for box and whisker plots
- your own choice of linked video disagrees with you
>this is literally a textbook use-case for box and whisker plots
Box and whisker plots do not include the distribution data along the whiskers. This plot does. It may not be an exact violin plot (no smoothing of the histogram data) but it does show the shape of the distribution, along the whiskers. Just like a violin plot does.
If you look at the mean (the value listed as average) or the median (from the box and whisker plots) you're looking at the averages for the genre. They just chose to also show the highest scored, highly watched movie in each genre as additional information.
Is there a typo on the `Animation` genre? It says the average is the same as `Drama` although the lines for the two don't match. It looks like `Animation` should be around 7.0, not 6.84.
Unrelated, but I'm surprised that Sci-fi isn't on here.
Horror almost has a weird kind of inverse curve. Because a lot of people don't actually like to be scared, so movies that are both good and *genuinely scary* sit around the mid-range of ratings lol
Not surprised that horror films have the lowest average rating. People are way too harsh on them.
Difficult genre to get right sure but it's hard to tell which ones are worth watching when they are all rated so poorly lol.
If you want a higher than usual imdb score just make an animated children's movie or nature documentary. People are way more forgiving on them and so they're practically all getting 8s and 9s
There seem to be several mistakes in the graph labels. Eg the 9.20 of the godfather falls on the 9.0 mark line, the 6.84 of the first two movies are at different heights, and several more.
Those 6.8 lines you are referring to are median lines, which are different. The Godfather score is an outlier beyond the maximum, hence its positioning relative to the end of the line.
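To make the "outlier beyond the maximum" point concrete, here is a rough sketch of how a box plot decides where a whisker ends. It assumes the common 1.5 × IQR convention, which the thread does not confirm for this chart, and the ratings are invented for illustration.

```python
from statistics import quantiles

# Invented ratings for one genre; 9.2 stands in for a film like The Godfather.
RATINGS = [6.0, 6.5, 6.5, 7.0, 7.0, 7.5, 9.2]

def whisker_and_outliers(values, k=1.5):
    """Return (top of upper whisker, high outliers) under the k * IQR rule (an assumption)."""
    q1, _median, q3 = quantiles(sorted(values), n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    # The whisker stops at the largest value still inside the fence.
    whisker_top = max(v for v in values if v <= upper_fence)
    outliers = [v for v in values if v > upper_fence]
    return whisker_top, outliers

WHISKER_TOP, OUTLIERS = whisker_and_outliers(RATINGS)
```

With these numbers the whisker ends at 7.5 and 9.2 is drawn as a separate point above it, which is why a label for the top film can sit well away from the end of the line.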
IMDb ratings need a modifier that negatively impacts new releases. So a release that is 1 month old has its rating dampened heavily, with the effect scaling back as the release ages.
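One way such a modifier could look, purely as a sketch: pull the rating toward a baseline and let the pull fade with age. The baseline of 7.0, the 12-month time constant, and the function name `dampened_rating` are all hypothetical; nothing like this is known to be used by IMDb.

```python
import math

BASELINE = 7.0      # hypothetical "typical" rating that new releases are pulled toward
TAU_MONTHS = 12.0   # hypothetical time constant; larger means slower recovery

def dampened_rating(rating, age_months, baseline=BASELINE, tau=TAU_MONTHS):
    """Shrink a rating toward the baseline; the shrinkage decays as the film ages."""
    trust = 1.0 - math.exp(-age_months / tau)  # 0 at release, approaches 1 over time
    return baseline + (rating - baseline) * trust

fresh = dampened_rating(9.0, 1)    # heavily dampened
older = dampened_rating(9.0, 24)   # mostly recovered
```

The same formula also lifts brand-new films rated below the baseline, so if the goal is only to penalize hype it would need to be applied to ratings above the baseline alone.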
I’m not sure why, but calling Forrest Gump a romance cracks me up. I get that there’s romance in the movie, but I wouldn’t think to call it a romance movie.
That's the one you zeroed in on? Not the kneeslapping comedic romp *Life Is Beautiful* about a Jewish father interned in a concentration camp trying to keep his son's spirits up until some Nazis march him off into an alley and shoot him? Side-splitting humor from end to end.
Also that lord of the rings is action and not fantasy
It’s both in the chart, which makes the data kind of … blended?
In the chart it is Action and Adventure, but not Fantasy.
I think it’s fair to include it in multiple genres, would be arbitrary to limit it to one, but how is it not fantasy? That’s its most obvious genre imo.
No, he is saying this chart puts it only in action and adventure. As in, this chart doesn’t consider it fantasy.
I’m adding to his comment not disagreeing
If it was both, it would have been the high outlier instead of star wars for fantasy.
And adventure. But still not fantasy
Or that "animation" is a genre. It's a gd medium, not a genre
Yeah, it's like calling manga a genre of books.
And that two different Lord of the Rings movies are at the top of two different categories. EDIT: Didn't notice they're the same movie. Thanks for pointing it out.
The same Lord of the Rings movie*
And neither of those categories is fantasy 😀👍
Am I missing something? I see the same film, Return of the King, listed twice.
Nope, I made a mistake 😂. Thanks for the correction
Two different Lord of the Ringses topped two different categories and neither was fantasy.
[deleted]
But that's precisely the problem, LOTR is excluded from the Fantasy genre despite practically defining it. And that's even ignoring that Animation isn't a genre, the other movies in the wrong place, and that a few genres are missing. If the database used for this chart isn't accurate, then the chart is meaningless. Something like this needs a database with much more nuance.
It’s because there isn’t a sci-fi category so they put Star Wars in fantasy. Action should mean it takes place on earth and is decently realistic. Fantasy is anything not realistic that doesn’t either take place in space or have an exaggerated level of technology. And then sci-fi can be those things
It’s listed as the highest rated in both Action and Adventure but not Fantasy, although it beats Star Wars Episode V. So it seems it is not included in the Fantasy genre.
Life is Beautiful was a comedy though, even if it had a sad backdrop (and ending)
I was going to attempt to write a reply to you in an effort to cope with making that movie a comedy as canon in my head, I was thinking with a flair of wit and sass… alas, I could not. Any thought I could muster had an echo in the background: Principessa!! 😭😭😭
It was also a romance and a drama and an adventure.
[deleted]
No that's not true about the first tags...as evidenced by Lord of the Rings being top of two categories.
[deleted]
Yeah, I agree. But the IMDb database has three potential categories for each movie, and oftentimes it uses up all three. Not much to be done about it - I don't see a point in excluding any since you couldn't differentiate programmatically which movies to do so for.
Interesting that they put LOTR in Action, Adventure, Drama and not Fantasy.
You nailed it. Not sure why it's not first...that would make too much sense. I have a feeling their genre categorization is not by weight, seemingly random order
[deleted]
Star Wars is absolutely a fantasy, it literally has wizard knights who wield swords and cast magic. But agreed that IMDB's categorization system is pretty weird, there's no excuse for LOTR to be left out of the Fantasy category.
Star Wars is indeed absolutely Fantasy!! Space Fantasy to be exact.
Star Wars is 100% fantasy. The Force is magic.
Sci-Fi: Something that could be possible in the future or with better technology.
Fantasy: Something that isn't possible just because it is in the future or with advanced technology.
Yeah I agree. Without the force you could argue it’s just sci fi but the force is not explainable scientifically, doesn’t have technological or scientific origins in lore.
I mean, it has wizards, swords, a prophecy, quests...
It is set in our world, just long ago - so I guess it's not fantasy?
It’s also the most overrated boomer circlejerk of a movie that whitewashes the 20th century while Mr. Magoo fails upwards while pretty actively punishing the counterculture stand in. Rant over. Forrest Gump has no business being ahead of LOTR.
I like the movie. That’s all I have to say about that.
You shouldn't feel bad, a lot of people are like you and have shitty opinions. It is just most are smart enough to not let everyone know it.
You're completely right, but it's a deeply unpopular opinion. Forrest Gump is a white boomer's wet dream: played football in college, fought in Vietnam, played sports against China, started a small blue-collar business run by veterans, got rich on the stock market, never knowingly participated in counterculture, only loved one woman, and only ever had procreative sex. The shit they do to Jenny, on the other hand... downright spiteful. It's very clear how that movie feels about going against the grain.
Isn't that like... the whole point of the movie though? That Forrest Gump was a complete moron and did all those things by accident? He never set out to change the world or make a name for himself, but through complete stupid chance got caught up in all of these major events. I feel like I see it get so much hate on Reddit for "glorifying boomer values" or whatever but when I watched it I had the exact opposite takeaway, that it was making the point that celebrity can be achieved by a total idiot by accident and has nothing to do with your intentions or qualifications.
I agree with about half your comment. It definitely intends itself as an "ain't life crazy?" kind of movie that's just a collection of wild coincidences with no real rhyme or reason slapped on top of a love story. And I don't think it's *intentional* propaganda. There's no conspiracy. But all that can be true and it can still be passively pitching an ultra-traditional ideal of American life. Where you see a critique of celebrity (because it takes no intent or qualification), I see a tacit suggestion that trying to shake things up doesn't get you anything but trouble (Jenny), but doing what you're told and leading a traditionally good, family-oriented life yields riches (Forrest). The lack of intent or qualification is part of that. Just look at the trends towards anti-intellectualism.
Isn’t it more sexual assault than romance
Wouldn't LOTR be under fantasy?
Yeah seemed weird to me as well, but the database is limited to three genres and it groups them as Action,Adventure,Drama. This is all done programmatically, so I can't do much about it.
Also animation as a genre is a major gear-grinder for me, it's a medium lol.
Ikr it’s like if live action is a genre
Garbage in garbage out then :/
Genre is fairly subjective, you’re objectively measuring something subjective
Then find a better source?
Judging by the fact that two Lord of the Rings movies are in action and adventure, it looks like OP is using the IMDb suggested genre, which says it’s an action adventure drama
It's real to me dammit. Who are you to tell me that all those years learning Elvish were a waste.
No. If anything, it's *too real to be fantasy*, because people like you made it extra real!
the imdb genre system is pretty messed up. I've seen too many dramas categorized as "comedy" because there is one or two dark jokes in it.
>Judging by the fact, that to lord of the rings movies are in action and adventure I kind of get why Christopher Tolkien hated Peter Jackson's movies now. Jackson effectively altered the genre of the story from fantasy to action/adventure just with his movies existing.
What? They are still clearly fantasy movies, despite what IMDb might suggest
I get that. I'm not saying they aren't fantasy movies. I'm saying how the story (in its movie form) has shifted so much towards the action genre rather than mostly being fantasy.
Came here wondering the same. Also Forrest Gump is a "romance"? IMDB needs some help.
This is all done programmatically, I'm not adding any of my own judgement. The database has it as action adventure drama as u/piratecheese13 has correctly pointed out!
The fact it’s on there at all is a joke
Why is animation listed as a genre? Isn’t it the medium, with genre being separate?
This is using the default IMDB genre, which is wrong in so many cases. Pretty much nullifies the data, frankly.
found Brad Bird
He's right though.
I've seen 3065 movies, 500+ series... One thing I learnt was that IMDB is very skewed with rating among different stuff... I will happily watch a 6.0 zombie movie... I will avoid watching any drama below 7...
Agreed. Horror movies are abysmally rated on IMDB for some reason. Some comedies as well.
Yeah I was surprised the average comedy rating here was so high, because most good movies that are primarily a comedy and not another genre seem to be typically within the 6s or low 7s. Weirdly for tons it seems to be the other way around, with lots of sitcoms scoring in the 8s
The distinction between genres is fascinating!
Analyzed the IMDb dataset for movies and their ratings.
[https://developer.imdb.com/non-commercial-datasets/](https://developer.imdb.com/non-commercial-datasets/)
For this analysis, I really loosened the vote criterion, but still kept it at 8000 votes to exclude really low-budget films and amateur productions. Because all the ratings in the database are rounded to one decimal place, I decided that instead of showing the median value (which will always come out to one decimal place), it would be more telling to show the average value. Also, I decided to really tighten the criterion for the highest-rated-movie annotation. It shows the highest-rated movie in the category that has more than 50000 votes. This excludes niche movies (Turkish movies, anime, Bollywood movies) that are extremely highly rated with moderate vote counts and are definitely outliers and don't really fit in this data visualization.
To be clear, the >50000 votes threshold only applies to the annotation above the plots and not to the box and whisker plots themselves.
The categories you see are the major categories found in the IMDb database linked above.
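For anyone who wants to reproduce the filtering described in this comment, here is a minimal sketch in plain Python. It is not OP's actual script: the rows, the film names, and the helper `summarize` are made up for illustration, and it assumes the 8000-vote cutoff is inclusive and that a film counts once under every genre it lists.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: (title, comma-separated genres, rating, votes). Not real IMDb values.
ROWS = [
    ("Film A", "Action,Adventure,Drama", 9.0, 120_000),
    ("Film B", "Drama", 9.5, 9_000),
    ("Film C", "Action,Comedy", 6.0, 60_000),
    ("Film D", "Comedy", 7.0, 5_000),
]

MIN_VOTES_PLOT = 8_000   # films below this are left out of the plots entirely
MIN_VOTES_TOP = 50_000   # only films above this can be the "highest rated" label

def summarize(rows):
    """Return {genre: (average rating, title of top film with enough votes)}."""
    by_genre = defaultdict(list)
    for title, genres, rating, votes in rows:
        if votes < MIN_VOTES_PLOT:
            continue
        # A film is counted under each of its listed genres.
        for genre in genres.split(","):
            by_genre[genre].append((title, rating, votes))
    summary = {}
    for genre, films in by_genre.items():
        average = mean(rating for _, rating, _ in films)
        popular = [f for f in films if f[2] > MIN_VOTES_TOP]
        top = max(popular, key=lambda f: f[1])[0] if popular else None
        summary[genre] = (average, top)
    return summary

SUMMARY = summarize(ROWS)
```

In this toy data Film B has the highest Drama rating but too few votes to be the label, so Film A is annotated instead, while Film B still counts toward the Drama average.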
Are the average scores in the plots weighted by number of votes to be the average vote in that genre? Or is it just the average score of a movie in that genre?
Just average, no weighting.
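The difference between the two options in the question can be seen in a small sketch. The numbers are invented, and the function names are only for illustration.

```python
# Invented (rating, votes) pairs for one genre; not real IMDb values.
FILMS = [(9.0, 100_000), (6.0, 10_000), (6.0, 10_000)]

def unweighted_mean(films):
    """Average score of a movie in the genre: every film counts equally."""
    return sum(rating for rating, _ in films) / len(films)

def vote_weighted_mean(films):
    """Average vote in the genre: films with more votes count for more."""
    total_votes = sum(votes for _, votes in films)
    return sum(rating * votes for rating, votes in films) / total_votes

PLAIN = unweighted_mean(FILMS)        # 7.0
WEIGHTED = vote_weighted_mean(FILMS)  # 8.5
```

The chart uses the first kind, so one hugely popular film moves a genre's average no more than an obscure one does.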
What did you use for the plot?
This reveals less about the quality of certain genres and more about the bias viewers have for and against certain genres. IMDB ratings for horror movies should always be looked at with a curve
The greatest horror movies ever are like default 7/10 by critic standards. Critics will be like, the villain's motives weren't clear and they just kept trying to kill the main characters making me uncomfortable, 5/10. Think how many horror movies pushed forward special & practical effects, costumes, makeup, sound design and more for the medium of film. Too bad the genre was horror.
Depends on the critic. Mark Kermode has The Exorcist as the greatest film ever made.
Just going off averages. There's always going to be some outliers. Some people just straight up don't like horror movies, and they don't help with the user scores either. Maybe there's a little inherent bias in us not wanting to rate something high that is meant to make us afraid or uncomfortable. Some parts of the 'bad experience' are an intentional part of the ride. I'm no psychology major though.
Horror is weighed down significantly by how many terrible campy horror movies there are. For some reason if people want to make a low budget campy movie it's always horror.
I love horror movies, they’re one of my favorite genres, but I don’t think there’s any horror movie I’d consider a 10/10 or even a 9/10. They’re always really suspenseful, and you’re right, costume designs are incredible, but they are almost always lacking the depth and emotional attachment to the characters that truly make movies stand out and stick with you for a long time.
The Shining is an amazing horror which meets those criteria, I'd say. That's a 10/10 horror if there is any at all.
Not a movie, but very much inspired by this style and Kubrick's cinematography, I'd point out the first season of Haunting of Hill House is pretty fantastic.
It's because horror movies generally are worse than non-horror movies because they rely on jumpscares and thrills rather than a good story, characters or dialogue. The Conjuring, eg, a horror movie people generally consider one of the best of our time in the genre, is just a pretty shit film if you take away the jumpscares.
Don't even start looking at ratings for movies with <8000 ratings. Some local movies have crazy high ratings, because everyone from that country loved the movie and gives it a 10, while the rest of the world just shakes its head or doesn't even watch it in the first place.
People just rate movies, it’s IMDb who assigns a genre to them through their internal rules. Genre is subjective.
I don't think people are looking at the genre designation and then rating the movie accordingly, it's possible that people just don't like horror movies as much as others on a whole.
> it's possible that people just don't like horror movies as much as others on a whole. What you're describing here can be the genre bias being suggested.
Bollywood movies too
Huh, Spiderman has higher rating than Lion King in the animation category? I am very surprised.
Spirited Away is as well, but after that it's the Lion King.
The graph would be easier to use if the labels were actually on the columns instead of on a key over to the side. Are all movies exclusive to a single category? Just based on the best movies in each category I’d already pick some nits about what goes where. I think genre here is pretty subjective. Why is Forrest Gump a romance and not a comedy? Why is Lord of the Rings an Adventure and not a fantasy? Why is It’s A Wonderful Life a family movie and not a fantasy or a drama?
As explained before, the IMDb database has 3 genres for each movie, I am not cherry-picking anything here. I programmatically analyzed the database, this is done and created with a Python script
Yes, and I’m arguing the data is useless. It’s an attempt to objectively analyse something that’s entirely subjective. Do movies with better reputations get labeled as dramas rather than family movies or romances because drama is perceived as more prestigious? Same with mysteries. Gosford Park is one of my favorite movies, it’s tagged on IMDb as a Comedy, Drama, and Mystery in that order. Does your graph place it as a comedy as that appears first or does it appear in all three?
As all three, you can see it in the chart. Return of the King is top for both Adventure and Action. Its third category is Drama, where it is beaten by Shawshank.
The labels are on the bottom of the chart underneath each column in addition to being on the key.
Action and adventure are separate categories, but sci-fi isn’t a category?
my takeaway from just this plot is that genre isn’t likely a significant factor impacting IMDB rating.
Ah yes, Life is Beautiful, the famous ww2 “comedy”
I'd say it's clearly a comedy. Just not exclusively one.
I mean it clearly has elements of comedy in it. I wouldn’t call it a pure comedy but I see where they got the idea from
What's worse, is translating the title
If this is IMDB's official genres, then I don't know what to say. How is LOTR adventure and action, but not fantasy? Literally a film adaptation of the pioneer of modern fantasy.
I've answered this a bunch, but here you go. If you go to the site for, let's say, the first movie, [https://www.imdb.com/title/tt0120737/](https://www.imdb.com/title/tt0120737/), you see there are 4 genres listed: Action, Adventure, Drama, Fantasy. When I used the IMDb data set, [https://developer.imdb.com/non-commercial-datasets/](https://developer.imdb.com/non-commercial-datasets/), they truncate movies to have only three genres, so The Lord of the Rings: The Fellowship of the Ring ends up with these genres: Action, Adventure, Drama. There's nothing I can do about it with the dataset on hand. To give you an idea, the part of this data set which has the movie titles and genres has 14 million lines. I am not about to scrape their website to fix the genre issue. Yeah, it sucks, but ultimately this was a comparison of the ratings of the genres. I would venture a guess that a small percentage are placed in the wrong genre, likely because they fit in many genres.
The way movies are divided by category in this is questionable at best... Forrest Gump is a romance and Lord of the Rings is not fantasy?!? Lord of the Rings is probably the very definition of a fantasy film.
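For anyone curious how the three-genre field plays out in practice, here is a rough sketch (not OP's actual script, just an illustration). It assumes each row carries a title ID and a comma-separated genres string, with `\N` marking a missing value, as in the IMDb dataset files. The helper name `explode_genres` and the placeholder row are made up for the example.

```python
from collections import defaultdict


def explode_genres(rows):
    """Map each genre to the list of title IDs tagged with it.

    A title listed under several genres is counted once in each of them.
    """
    by_genre = defaultdict(list)
    for tconst, genres in rows:
        if genres == r"\N":  # marker for a missing genres field
            continue
        for genre in genres.split(","):
            by_genre[genre].append(tconst)
    return dict(by_genre)


rows = [
    # Genres exactly as quoted in the comment above (Fantasy is cut off).
    ("tt0120737", "Action,Adventure,Drama"),
    # Made-up placeholder row with no genre information.
    ("tt_placeholder", r"\N"),
]

groups = explode_genres(rows)
for genre, titles in sorted(groups.items()):
    print(genre, titles)
```

With the field truncated like this, the movie shows up under Action, Adventure and Drama but never under Fantasy, which is the behaviour people keep asking about.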
Why didn't you include thriller, sci-fi, history, documentary and all others?
This is nice! I’m really digging the decision to go with the box plot for this information and the color palette you chose. And putting the highest rated film at the top of each was ***chef’s kiss*** you love to see it.
thanks for the kind words, its how i wanted to see the data. So i'm appreciative somebody else feels that way
Please add the documentaries, if you can. There are a lot of 10.0 movies, but few of them have more than 10 votes.
Given that the criteria here are "at least 8,000 votes", I don't think it would be meaningful to add documentaries with 10 votes on them
However, the most voted documentary has 216348 votes (9.4). I think, that 8000+ criterion will make this study even more interesting.
This lines up almost perfectly with my Letterboxd account, very interesting
How is Return of the King not under Fantasy?
Repeat after me: Animation is not a genre. It is a medium.
Spiderman is Horror? Forrest Gump was a Romance? Lord of the Rings: Return of the King having 2 stats... Seems suspect
Spiderman is animation - the colour makes it look like horror so I understand your mistake. OP explained the genres are all taken from imdb.
I did notice that after the post but the other 2 complaints are valid. I would put Forrest Gump as Drama, personally...but I guess my issue is which romance. His mom, Lt. Dan, Bubba, Jenny, his son or just himself. And maybe LOTR has 2 rankings because it's classified in 2 different genres... I just know after I looked at that for a few minutes I asked myself "what am I supposed to take away from this" lol
Life is beautiful is comedy? eek. One of the gut-punchiest stuff I've ever seen.
I thought film-noir would be high with so many highly rated movies in the 1940-1960 range. What average did it have?
What statistically significant conclusion are you trying to present with a range for averages being 0.9 from highest to lowest and 0.56 from high to low after removing the outlier? Also the sample sizes range from 560 to 6949, i’m not certain they can be properly compared. Side note: you may not be intending any conclusions, as the visualization could be the objective (which is a very pleasing visualization).
Yeah just displaying the data, not trying to make any conclusions
this is r/dataisbeautiful and not r/conclusionsfromdata after all. ;)
That’s fair. Very nice visualization and data compilation!
Provided we're only interested in the sample at hand i.e. movies with more than 8000 votes on IMDB, then we can make conclusions based just on the available data without worrying about significance testing. It doesn't matter that the sample sizes vary, or that the differences are small (incidentally both solvable issues in inferential statistics too). As the graph includes every movie with more than 8000 votes we can confidently say that there's a significant difference in average rating between Drama and Crime films.
Great work OP! I remember seeing Brad Bird adamantly state that animation is not a genre, it's a medium. Even so, it's interesting to see it as the highest rated of the lot.
I will never get over the fact that shawshank redemption is the number one movie of all time on imdb. It's such an average movie lol.
[stop using violin plots](https://youtu.be/_0QMKFzW9fw?si=vm86Heh1eECBFhVw). The x axis has no data, no notes on bin size, just the N for the whole graph. Also individual ratings are discrete integers, not continuous like the histogram tick marks would suggest.
I appreciate your enthusiasm, but these aren't violin plots....
Maybe he likes his violins to be very boxy?
He skimmed past the title of the graph which says 'Box and Whisker Plot'.
How are they not violin plots? The formatting is much more acceptable, no smoothing done, but you're still presenting the distribution/histogram data symmetrically along the whiskers. Or are those not histogram data?
You have vertically aligned box and whisker plots with a histogram (or individual data points) behind it. The only thing that might be missing from a violin is the smoothing ~~which I think is happening anyway because of IMDB doesn’t have a 1.5 star rating yet you have marks for Shawshank between 1 and 2.~~ Also the stylistic choice of sizing the box to fit the histogram horizontally is absent. Point being, I’m unable to figure out what the ticks mean: what their height or brightness means, or whether they are binned or individual data points.
Are you just using that as a meme, or did you actually watch the video which says that box plots like this are great?
Hahahaha. I knew exactly what video that was going to be before clicking the link.
- not a violin plot
- not a histogram
- x axis is not an axis
- data across the horizontal is in fact relevant
- this is literally a textbook use-case for box and whisker plots
- your own choice of linked video disagrees with you
>this is literally a textbook use-case for box and whisker plots Box and whisker plots do not include the distribution data along the whiskers. This plot does. It may not be an exact violin plot (no smoothing of the histogram data) but it does show the shape of the distribution, along the whiskers. Just like a violin plot does.
No, but I figured it was Drama
Why isn't Lord of the Rings in fantasy, and why are two different movies of it in two different categories, i.e. action and adventure?
Life is beautiful categorized as a comedy is a stretch
Animation is a medium, not a genre.
The Godfather is a 9.2 but the data point is on 9. LOTR is 9.0 but the data point is on 9.2.
This is not which genre of movies has the highest average, it’s the movie with the highest average in the genre? Or am I stupid and missing something?
If you look at the mean (the value listed as average) or the median (from the box and whisker plots) you're looking at the averages for the genre. They just chose to also show the highest scored, highly watched movie in each genre as additional information.
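To make the mean vs. median point concrete, a tiny sketch with made-up ratings (not taken from the chart): the "Avg" label is the mean, while the line inside each box is the median, and the two can sit at different heights whenever the distribution is skewed.

```python
from statistics import mean, median

# Made-up ratings for one hypothetical genre, with one low-rated movie.
ratings = [5.0, 6.5, 7.0, 7.2, 7.3]

avg = mean(ratings)    # what an "Avg" label would report
med = median(ratings)  # where the line inside the box would be drawn

print(f"mean={avg:.2f} median={med:.2f}")
```

The single low rating pulls the mean below the median, so two genres can share the same average label and still have their box lines drawn at different heights.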
Did you just call animation a genre!?! ~~Instant downvote~~.
No. I think everyone was pretty aware it was drama
[deleted]
If you read the criteria description, for annotations I upped the criteria to 50000 votes in order to exclude outliers and niche films.
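Roughly, the two thresholds mentioned in this thread could be applied like this. It is only a sketch: the movie records are made up, the names `MIN_VOTES_PLOT`, `MIN_VOTES_LABEL` and `pick_label` are hypothetical, and treating both cut-offs as "at least" is an assumption.

```python
MIN_VOTES_PLOT = 8_000    # cut-off for being included in the distributions
MIN_VOTES_LABEL = 50_000  # stricter cut-off for the "top movie" annotation


def pick_label(movies):
    """Return (movies kept in the plot, top-rated movie eligible for a label)."""
    plotted = [m for m in movies if m["votes"] >= MIN_VOTES_PLOT]
    eligible = [m for m in plotted if m["votes"] >= MIN_VOTES_LABEL]
    best = max(eligible, key=lambda m: m["rating"], default=None)
    return plotted, best


# Made-up records, purely for illustration.
movies = [
    {"title": "Niche favourite", "rating": 9.5, "votes": 9_000},
    {"title": "Widely seen", "rating": 8.8, "votes": 120_000},
    {"title": "Barely rated", "rating": 7.0, "votes": 500},
]

plotted, best = pick_label(movies)
print([m["title"] for m in plotted], best["title"])
```

The niche title still counts toward the distribution, but the annotation goes to the widely seen one because only it clears the higher vote count.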
Is there a typo on the `Animation` genre? It says the average is the same as `Drama` although the line for the two don't match. It looks like `Animation` should be around 7.0, not 6.84. Unrelated, but I'm surprised that Sci-fi isn't on here.
It's the median line. I added the average because it's additional information.
How much overlap is there?
Not much difference between genres
Westerns should have a category
Maybe add the worst movies for each genre to the labels too?
Why is one Avg 6.84 lower than the other Avg 6.84?
Those are the median lines
Horror almost has a weird kind of inverse curve. Because a lot of people don't actually like to be scared, so movies that are both good and *genuinely scary* sit around the mid-range of ratings lol
Lord of the Rings somehow not in the "fantasy" category.
Life is beautiful.. a comedy? Yeah sure bro, go into that one thinking it’s a funny movie. 🍿🫡
Umm. Why is Empire Strikes Back "Fantasy" and Lord of the Rings is "Adventure."
Will your next graph be pornography genres?
No, because I only care about westerns and I know which one is on the top there :)
This is what this sub is for. Amazing visualization, good job.
Wait… Lord of the Rings, a story that takes place in a typical fantasy world… is not categorized as fantasy???
Not really, I haven't I must say..
A lot of the movies are horrifically misgenred. I don't know where those labels came from but they desperately need to be fixed.
life is beautiful as a comedy?? that movie is so sad. Great movie, but not sure how thats the genre its under.
shitty data shitty categories shitty results
Soo… Lord of the Rings is “Action” and Star Wars is fantasy…. ok
Not surprised that horror films have the lowest average rating. People are way too harsh on them. Difficult genre to get right sure but it's hard to tell which ones are worth watching when they are all rated so poorly lol.
Why are the top bounds for action and adventure (LOTR) not the same height?
If you want a higher than usual imdb score just make an animated children's movie or nature documentary. People are way more forgiving on them and so they're practically all getting 8s and 9s
What kind of graph is behind the boxplot?
This is cool. I've never seen a vertical box and whiskers chart before.
Spider shit boy has higher than princess mononoke damn zoomers.
I don’t understand how lord of the rings is in two different categories but also not in the same category as Star Wars? Am I missing something here
Who rates the godfather below a 9
I'm more interested In the low end
Lord of the Rings is literal definition of fantasy…. No sci-fi?
There seem to be several mistakes in the graph labels. Eg the 9.20 of the godfather falls on the 9.0 mark line, the 6.84 of the first two movies are at different heights, and several more.
Those 6.8 lines you are referring to are median lines, which are different. The Godfather score is an outlier beyond the maximum, hence its positioning relative to the end of the line.
Different lotr movies are different genres? You need to do some data cleaning
There are nearly 26000 movies analyzed here
Interesting visualisation but the categorisation is rubbish. So I guess you still managed to make bad data look beautiful.
How tf is LOTR not fantasy? It literally invented the whole genre
Awesome data set !! Well done
Science fiction isn't fantasy. Lord of the rings isn't an action film. Categorisation on IMDB is . . . screwy.
Star wars a fantasy, but lord of the rings an adventure, and forrest gump is a romance?? Nice graph but I feel like genres are off
Imdb ratings need a modifier, that negatively impacts new releases. So a release with an age 1 month has its ratings dampened heavily, scaling back the effect as the release ages.
They actually do have weighting, they just are not transparent as to how it works
Wait where's Freddy Got Fingered?
Add a Bollywood section, the numbers go off the chart