Sequenz No.3 Rex tremendae & Sequenz No.5 Confutatis
posted by max on November 18, 2007 at 08:40:39 PM
Good evening pals,
It is with much pride I present to you the new YTMND starbar tonight. Not only does it provide some new functionality, but it also represents a great technical achievement.
The new starbar allows multiple score views, voting feedback, site-wide favorite recognition, and single-click adding/removing of favorites, as well as a spectacular feat: collaborative filtering vote prediction. Read more for a long-winded description and explanation.
The original "vote bar" was one of the oldest pieces of YTMND. It was created from stolen Netflix code for a project that pre-dates YTMND by almost a year and was basically an archaic piece of junk. I had created a program to generate colored star bars and the two fit well, so it became an integral part of YTMND.
It had some major technical issues attached to it, though. One was that a user's score had to be passed from the server at the time the page loaded, and since votebars show up randomly across pages, it ended up requiring an extra database query for each bar. This has changed completely: votes now load after the page loads, turning up to hundreds of queries into one.
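To make the "hundreds of queries into one" idea concrete, here is a minimal sketch in Python with SQLite. It is not the actual YTMND code: the `votes` table, its columns, and the `load_votes` helper are all assumptions made up for illustration. The page collects the ids of every starbar it rendered and asks for all of that user's votes in a single query.

```python
import sqlite3

def load_votes(conn, user_id, site_ids):
    """Return {site_id: score} for one user across all starbars on a page.

    Hypothetical helper: one IN (...) query replaces one query per bar.
    """
    if not site_ids:
        return {}
    placeholders = ",".join("?" for _ in site_ids)
    cursor = conn.execute(
        "SELECT site_id, score FROM votes "
        f"WHERE user_id = ? AND site_id IN ({placeholders})",
        [user_id, *site_ids],
    )
    return dict(cursor.fetchall())

# Tiny in-memory demo; the schema is an assumption, not the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (user_id INTEGER, site_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 11, 2), (2, 10, 1)],
)
page_votes = load_votes(conn, user_id=1, site_ids=[10, 11, 12])
```

Sites the user has not voted on (site 12 here) are simply missing from the result, so the bar can fall back to showing only the site score.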
Another major issue was that it provided no feedback if there was a problem. From the user's perspective, you voted, and if something went wrong on the back end, you had no idea. This has changed as well.
So the next couple weeks will be a "forced beta" to see how well everything holds up. Instead of a giant wall of text, I'll try to give an overview of what the new starbar is capable of.
New Features
Multiple score views
-
This nifty feature allows us to combine two starbars into one. The prime example of this is viewing a starbar for a site you've already voted on. In the old setup, you could only view your score and had to visit the site's profile page to see what the overall score of the site was. Now when you vote on a site, the votebar figures it out and displays both the site's score and your rating. We mix the color of the two bar types where they overlap. Your vote is blue and the site score is red, so where they overlap the color of the bar will be purple.
Examples:
![]() | You vote 3 on a site with a score of 4.5 |
![]() | You vote 5 on a site with a score of 1.0 |
![]() | You vote 2 on a site with a score of 2.5 |
I haven't finalized the exact colors yet, but this should give you an idea of how this feature functions. I think at first it may be hard to absorb but over time it will become an integral piece of YTMND.
Extended Security
-
Previously when a user would vote, it would hit a REST interface that at first did nothing but check if the user was logged in. This meant people could link to the interface in an iframe on their websites and cause people to unknowingly vote. After people began exploiting this, the user id number was required for the vote to register. Once this was enabled, people began writing scripts to automatically vote, and even with a user_id it was possible to make targeted users unknowingly vote on sites. The new starbar works on the same principles as the comment voting interface. We now generate a cipher specific to each user using a rolling salt that changes every few minutes. This enables us to ensure (for the most part) that users will not unknowingly vote on sites, and it makes vote scripts and bots more difficult. One of the unfavorable effects of this new system is that after around 20 minutes, pages expire and you will have to refresh them in order to vote. The starbar will notify you if this happens.
Voting feedback
-
With the old votebar, voting was sort of "click-and-pray" in that you had no idea if your vote was registering or not. Due to the massive amount of vote lookups, user votes were loaded from a slave database, and if database replication failed, it would look as if your votes weren't registering even if they had. While we are still going to use a slave database for vote lookups, if a vote fails, you will get a message as to why. Some instances of this are:
- You are trying to vote on or favorite a site you have not seen.
- You are trying to vote on or favorite a site you own.
- Your authentication has expired.
- An internal error (which should almost never happen).
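The security check and the feedback messages fit together roughly like the Python sketch below. The post does not say how the per-user cipher is actually built, so this swaps in a standard HMAC keyed by a salt derived from the current time window. The secret, the five-minute window, the four-window grace period, and every function name here are assumptions for illustration only.

```python
import hashlib
import hmac

SERVER_SECRET = b"example-secret"  # hypothetical; never shipped to the browser
WINDOW_SECONDS = 300               # assumed: the salt rolls every 5 minutes
MAX_AGE_WINDOWS = 4                # assumed: tokens expire after roughly 20 minutes

def _salt(window):
    """Derive the rolling salt for one time window from the server secret."""
    return hmac.new(SERVER_SECRET, str(window).encode(), hashlib.sha256).digest()

def make_token(user_id, now):
    """Token embedded in the page when it is rendered for this user."""
    window = int(now) // WINDOW_SECONDS
    return hmac.new(_salt(window), str(user_id).encode(), hashlib.sha256).hexdigest()

def token_valid(user_id, token, now):
    """Accept tokens issued in the current window or the few before it."""
    current = int(now) // WINDOW_SECONDS
    for age in range(MAX_AGE_WINDOWS + 1):
        expected = hmac.new(
            _salt(current - age), str(user_id).encode(), hashlib.sha256
        ).hexdigest()
        if hmac.compare_digest(expected, token):
            return True
    return False

def cast_vote(user_id, token, site, seen_site_ids, now):
    """Return (ok, message) so the starbar can tell the user what happened."""
    if not token_valid(user_id, token, now):
        return False, "Your authentication has expired. Refresh the page to vote."
    if site["owner_id"] == user_id:
        return False, "You are trying to vote on or favorite a site you own."
    if site["id"] not in seen_site_ids:
        return False, "You are trying to vote on or favorite a site you have not seen."
    return True, "Vote recorded."
```

Because the token depends on both the user id and a salt only the server can compute, a link dropped into an iframe cannot carry a valid token for whoever happens to load it, and a script has to fetch a fresh page every few minutes to keep voting.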
Site-wide favorite recognition and vote loading
-
The starbar now checks if you've got each YTMND on your favorites list across the entire site. Once it is out of beta, any remaining areas where your votes aren't shown or voting is not allowed (such as in search results) will be updated to show your vote/allow you to vote.
Quick favorite and un-favorite
-
A new starbar addon which is appended to the end of the starbar in some places (currently only on the site itself and the site profile) allows you to add a favorite to your list with a single click. Sites that are currently on your favorite list will allow you to "unfavorite" them when you hover over their starbar.
Additionally, adding favorites has been changed so that when you add a site to your favorites list, it will automatically vote five on that site. When you "unfavorite" a site, the vote will remain. Once the starbar is out of beta, all previous favorites will be updated to five-star votes (if they were fav'd without voting, a new five-star vote will be added).
Holy shit: Collaborative filtering.
-
This is a feature I've devoted some free time to for over a year. For those of you that don't know what collaborative filtering is, it's when you take a massive amount of data on who likes what and use it to figure out what each user might like. Simply put, collaborative filtering allows us to predict what you will rate a site you haven't even seen based on how you've voted in the past.
Sadly, the majority of you have no idea what an amazingly complicated feat this is to accomplish. The small number of you that have dealt with this type of system in college or business dealings will understand how awesome it is that we are launching this on such a limited hardware platform.
<technical jargon>
-
I'll try to give an idea of how I accomplished this for the two or three of you who are interested in the technical side of this. First we gather the more than twenty million YTMND votes into the memory of a C++ program I wrote which uses Simon Funk's SVD algorithm to calculate "feature" scores for each user and site. This consists of hundreds of billions of calculations, and with the full YTMND data set it takes roughly 70 minutes using 100% of a 2.4 GHz AMD Opteron and around 500 MB of memory. At this point we export the feature data to a SQL VIEW which allows us to calculate the prediction for any given user+site combination by performing (site features * corresponding user features), summed across all features. We do this instead of storing each prediction because it would require (users*sites) rows (currently around 180 billion) in a database, most of which would never be accessed.
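For anyone who wants to see the shape of the idea, below is a toy Python sketch in the style of Simon Funk's approach: features are trained one at a time by gradient descent on the vote errors, and a prediction is the sum of user feature times site feature. This is not the C++ program described above. The learning rate, regularization, epoch count, feature count, and the clamp to the 1 to 5 range are all assumed values picked for the example.

```python
def dot(user_features, site_features):
    """Raw prediction: sum of (user feature * corresponding site feature)."""
    return sum(u * s for u, s in zip(user_features, site_features))

def predict(user_features, site_features):
    """Prediction clamped to the 1-5 star range (the clamp is an assumption)."""
    return min(5.0, max(1.0, dot(user_features, site_features)))

def train(votes, n_users, n_sites, n_features=2, epochs=200, lrate=0.01, reg=0.02):
    """Fit feature vectors to (user, site, score) votes, one feature at a time."""
    users = [[0.1] * n_features for _ in range(n_users)]
    sites = [[0.1] * n_features for _ in range(n_sites)]
    for f in range(n_features):
        for _ in range(epochs):
            for u, s, score in votes:
                err = score - dot(users[u], sites[s])
                uf, sf = users[u][f], sites[s][f]
                # Nudge both features toward reducing the error on this vote.
                users[u][f] += lrate * (err * sf - reg * uf)
                sites[s][f] += lrate * (err * uf - reg * sf)
    return users, sites

# Toy data: users 0 and 1 agree with each other, user 2 has opposite taste.
votes = [(0, 0, 5), (0, 1, 1), (1, 0, 5), (1, 1, 1), (2, 0, 1), (2, 1, 5)]
users, sites = train(votes, n_users=3, n_sites=2)
guess = predict(users[0], sites[1])  # what user 0 might rate site 1
```

Only the two small feature tables need to be stored. Any user+site prediction is then a handful of multiplications at lookup time, which is the trade-off the paragraph above describes.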
Due to the nature of the algorithm, a single vote change or addition/removal recurses through the entire tree. For instance, if a user votes on a site, that user's features have to be recalculated based on the new information and that site's features have to be recalculated as well; any site that user has voted on has to have its features updated based on the new user feature scores, as does anyone who has voted on the site, and any sites they've voted on, etc. This means that the features cannot be updated incrementally and we have to recalculate feature data in full every time. This can be done every few days or so and then imported into SQL.
Since we have to calculate predictions on the fly, doing straight top-N recommendations isn't very easy as it requires a massive amount of calculations (sites*features (currently around 20 million)) and then a bubble sort on over 500,000 numbers for a single user. While it's doable, it doesn't really have any place in a production environment at the moment.
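As a rough illustration of why top-N is expensive, here is a hedged Python sketch. Every site still has to be scored for the user (that is the sites*features cost), but the full sort can be avoided by keeping only the best N in a heap. Note this uses `heapq.nlargest` in place of the bubble sort mentioned above, and the function name and data layout are made up for the example.

```python
import heapq

def top_n_sites(user_features, site_features_by_id, voted_site_ids, n=10):
    """Return the ids of the n sites with the highest predicted score.

    Hypothetical helper: scores every unvoted site, keeps only the best n.
    """
    scored = (
        (sum(u * s for u, s in zip(user_features, features)), site_id)
        for site_id, features in site_features_by_id.items()
        if site_id not in voted_site_ids
    )
    return [site_id for _, site_id in heapq.nlargest(n, scored)]

# Tiny example with two features per site.
example_sites = {1: [1.0, 0.0], 2: [2.0, 2.0], 3: [0.0, 1.0], 4: [4.0, 0.0]}
best = top_n_sites([1.0, 0.5], example_sites, voted_site_ids={4}, n=2)
```

The heap only trims the sorting step. The scoring pass over every site remains, which is the part that makes this hard to run per user on live traffic.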
We can create an accuracy score for each user by getting a list of their entire vote history and then comparing each vote to its corresponding prediction. So if I vote 5 on a site and the prediction was 4.5, it was off by 0.5. When we average these differences across all votes, we can create a score from zero to five based on how far off the predictions are on average. So to summarize it's one gigantic hack.
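The averaging step can be sketched in a few lines of Python. This only computes the average distance between votes and predictions. How that number is turned into the final zero-to-five accuracy score is not spelled out above, so the sketch stops at the average, and the function name and dictionary layout are assumptions.

```python
def average_prediction_error(votes, predictions):
    """Mean absolute difference between a user's votes and their predictions.

    Both arguments map site_id -> score. Sites without a prediction are skipped.
    Returns None when there is nothing to compare.
    """
    errors = [
        abs(votes[site_id] - predictions[site_id])
        for site_id in votes
        if site_id in predictions
    ]
    if not errors:
        return None
    return sum(errors) / len(errors)

# A vote of 5 against a prediction of 4.5 is off by 0.5, and so on.
my_votes = {101: 5, 102: 3, 103: 1}
my_predictions = {101: 4.5, 102: 4.0, 103: 1.0}
off_by = average_prediction_error(my_votes, my_predictions)
```

A lower average means the predictions track that user's votes more closely.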
</technical jargon>
How this will affect you
-
The system is very specific to your voting history. This means that while some users will get really good predictions, others will get awful predictions. There are two major factors in how accurate your predictions will be: how many votes you've made in the past, and how many votes have been made on each site you get a prediction for.
Simply put, the closer your votes are to how you actually feel a site should be rated, the better your predictions will be. If you make a lot of five-star votes in the hopes that other people will return the favor, your predictions will be bad. If you down-vote out of spite, your predictions will be bad. If you have made very few votes or the site you are getting a prediction for has very few votes, it's likely the prediction will be off. So if you haven't been voting honestly, this system will be almost totally useless.
Depending on how busy the system is, it may take a while to calculate predictions for you. This means that your previous votes and your predictions may take a while to show up on votebars. Depending on how badly YTMND shits the bed over the next few days, I may make predictions an option you can turn off if you feel they aren't useful for you. I will also add something for you to see how accurate your predictions are on average once we are out of beta.
How this will be used
-
Since the system is so heavily based on vote history, and predictions are only updated at 24-hour intervals at most, this is really not very useful on the front page, since the majority of sites there are relatively new and predictions would frequently be inaccurate. The main use of this feature is for browsing large lists of sites like on user profile pages or search results. Predictions will allow you to quickly figure out what sites you may enjoy more than others. The new color for predictions will be gold, which will mix with the current color of scores, which is red, so the resulting overlap color will be orange. You will not get predictions for sites that you've voted on/favorited already.
Examples:
![]() | A site with a score of 4.5 with a custom prediction of 3.0 |
![]() | A site with a score of 1.0 with a custom prediction of 5.0 |
![]() | A site with a score of 2.5 with a custom prediction of 2.0 |
As you can see, the prediction mix color is subtle. This was done on purpose, as we don't want to influence people's decisions so much as help them sift through a lot of garbage.
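For the curious, the overlap coloring (red plus blue giving purple for your own vote, red plus gold giving orange for a prediction) can be sketched in Python as below. The exact RGB values and the simple channel-averaging blend are assumptions. The real starbar images may be generated quite differently, and the final colors are not settled.

```python
RED = (255, 0, 0)     # site score (assumed RGB value)
BLUE = (0, 0, 255)    # your vote (assumed RGB value)
GOLD = (255, 215, 0)  # prediction (assumed RGB value)

def mix(color_a, color_b):
    """Blend two RGB colors by averaging each channel."""
    return tuple((a + b) // 2 for a, b in zip(color_a, color_b))

def bar_segments(site_score, other_value, other_color, site_color=RED):
    """Split one bar into (start, end, color) segments measured in stars.

    The shared part of the two bars gets the mixed color; whatever sticks
    out past the overlap keeps the color of the longer bar.
    """
    low, high = sorted((site_score, other_value))
    segments = []
    if low > 0:
        segments.append((0.0, low, mix(site_color, other_color)))
    if high > low:
        longer = site_color if site_score > other_value else other_color
        segments.append((low, high, longer))
    return segments

# You vote 3 on a site with a score of 4.5: purple up to 3, red from 3 to 4.5.
voted_bar = bar_segments(4.5, 3, BLUE)
# A site with a score of 1.0 and a prediction of 5.0: orange up to 1, gold after.
predicted_bar = bar_segments(1.0, 5.0, GOLD)
```

With these values the vote overlap comes out as (127, 0, 127) and the prediction overlap as (255, 107, 0), which matches the purple and orange described in the post.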
If you encounter any problems or bugs, post a comment here and over the next day I'll try to hammer out any problems. Moving on...
Much as I expected, the limited number of you technically capable of understanding the API had little to no interest in using it. There have been three entries to the API Contest so at this point in time, everyone who entered will get a "prize". You still have a couple weeks to enter, so come up with an idea and enter the contest so I don't feel like writing the API was a total waste.
I know YTMND is suffering from "broken-window" syndrome, and I've been working on it a lot, making many small improvements on both the back and front end of the site. Obviously the main issue now is moderation, which is going to be the primary focus: I am actively going over the technical design and coming up with a system that will be far more self-sufficient than the current "moderators clean up after users" setup.
Hurray for gigantic news posts.
Add a comment
^ Nice, I really like that layout you made. If I can add any input, perhaps on a user's profile page it can tell how many users have subscribed to this user (similar to a site's comment page, where it shows how many users have fav'd the site). It's just a suggestion.
To be honest, I have actually been hoping for a system of subscribing to our favorite users. Thanks a ton for working on it!
At first I thought collaborative filtering would protect me from viewing sites I don't like, but all it's done is make me curious. For example, I didn't bother looking at that site called "^_^" until collaborative filtering predicted I would hate it.
I think this is going to make YTMND a lot more fun. Good work Max! You've earned those probably hundreds of dollars I've spent on you.
Yet another goofy *ss feature that freaks me the hell out and I hate which I won't be able to live without in a month. Also the technical jargon was pretty interesting wasn't an algorithm I was familiar with but seems pretty boring just like every other algorithm ever created and I'll probably find myself reading up on it tomorrow.
That was the point; every feature added ends up in users contradicting themselves. Think how many people pissed and moaned when the new layout was implemented. Now think if Max were to revert back to the old layout tomorrow...whom do you think would be the first to piss and moan? You guessed it...vegetarian sharks who just happen to be showgirls in Vegas yearning to become accountants.
max, the vote/rating starbar is fine, but please get rid of the other. I hate being told how much i'm supposed to like a site before i view it. It ruins what would otherwise be a site i might like. Playing with something so fundamental as the readtitle-click-view experience like this is really damaging. i mean, critical hit stuff here. I'll probably have to view ytmnds while logged out if you keep this.
max, could you please make an option to add a note immediately when you favorite a site. Something like when you click "add to favorites" in a site profile a little typing box could appear, similar to when you click "reply" on a comment. Not sure how that would work when you click "fav" while you're watching a site though. Anyway, at the moment the only way to add a note to your new favorite sites would be to scroll to the bottom of your favorites list and click "add note."
Just saying that now being able to FAV a site while you're watching it, and seeing which sites you've fav'd just by looking at the home page seems to encourage more people to add sites to their favorites more often. At least it's having that effect on me. Though, I kind of think that fav'd sites mean more when there's a note added to them.
seriously. how about you just print a juicy part of one of your gay algorithims with ytmnd faded encompassing the formula in the background. it would scream geek and would guarantee that every frat boy and his mother would kick our asses and rape us dead. you should get on that. no really, get on it.
BillyZane, are you seriously that afraid to wear a YTMND shirt? For god's sake, I was homecoming king at my school, and I f*cking wore a commie hat to school everyday. Most frat boys probably have no idea what YTMND is, and if they do, why would they care? I mean, look at Family Guy, a very popular show, ripe with Star Wars references, yet many people (even frat boys) watch it. Now my ranting has caused this post to become far too long. OFFICIAL YTMND T-SHIRTS=GOLD. Also, c*cks.
Looking at one's own favorites page, it now displays a "FAV!" starbar for every single entry (which is useless), instead of showing the average rating of the site. The number of the average rating is still there, but I'd see it more useful having it there in starbar form rather than being reminded that every site on my own favorites page is, indeed, a favorite
Awesome! The collaborative filtering feature is great. One thing I'd like to see is the ability somewhere on the "Sites" page to click a link that randomly picks 100 sites that it thinks you'll either like (3-5 stars) or dislike (1-3 stars) but haven't yet voted on. Unfortunately, this would probably make it a lot easier for douchebags who vote 1 or 5 on everything, but it would be an awesome feature for us real YTMND addicts.
Also, I'm a bit torn about the Favorites thing as well. I use it to track worthwhile sites, not necessarily ones that deserve 5's. Even a 3 or 4 could be a favorite site but not worthy of a 5 in my opinion. I realize bookmarks would be a better solution in most cases, but I typically favorite things so I can log in anywhere and show my friends. If favoriting something is automatically going to 5 it, could we get a page in our profile that lists sites we've voted on, sorting them by number of stars?
"So to summarize it's one gigantic hack. "
nice meng heh
a suggestion for voting if you will, perhaps add a vote count per user, in a set interval, which they can max out at (say 30 per hour)
how many YTMND's can one user view in a given hour?
this way you keep the rowset down to user but limit voting abuse.
good job on everything. i was wondering when the favorite system would be updated





