Sequenz No.3 Rex tremendae & Sequenz No.5 Confutatis
posted by max on November 18, 2007 at 08:40:39 PM
Good evening pals,
It is with much pride I present to you the new YTMND starbar tonight. Not only does it provide some new functionality, but it also represents a great technical achievement.
The new starbar allows multiple score views, voting feedback, site-wide favorite recognition, and single-click adding/removing of favorites, as well as a spectacular feat: collaborative filtering vote prediction. Read more for a long-winded description and explanation.
The original "vote bar" was one of the oldest pieces of YTMND. It was created from stolen Netflix code for a project that pre-dates YTMND by almost a year and was basically an archaic piece of junk. I had created a program to generate colored star bars and the two fit well, so it became an integral part of YTMND.
It had some major technical issues attached to it though. One was that a user's score had to be passed from the server when the page loaded, and since votebars show up all over the site's pages, each bar ended up requiring an extra database query. This has changed completely: votes now load after the page loads, turning up to hundreds of queries into one.
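A minimal sketch of the "hundreds of queries into one" idea, assuming a hypothetical `votes(user_id, site_id, score)` table and using SQLite purely for illustration (the real schema and database are not described in this post):

```python
import sqlite3


def load_votes(conn, user_id, site_ids):
    """Fetch one user's votes for every starbar on a page in a single query.

    The table and column names are hypothetical; only the batching idea
    comes from the post.
    """
    if not site_ids:
        return {}
    # One "?" placeholder per site id, so the whole page is covered by one query.
    placeholders = ",".join("?" for _ in site_ids)
    rows = conn.execute(
        "SELECT site_id, score FROM votes "
        f"WHERE user_id = ? AND site_id IN ({placeholders})",
        [user_id, *site_ids],
    )
    # Map site_id -> score; sites the user has not voted on are simply absent.
    return dict(rows.fetchall())
```

The page script would collect the ids of every starbar it finds, call something like this once, and then fill in each bar from the returned mapping.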
Another major issue is that the old bar provided no feedback if there was a problem. From the user's perspective, you voted, and if something went wrong on the back end, you had no idea. This has changed as well.
So the next couple weeks will be a "forced beta" to see how well everything holds up. Instead of a giant wall of text, I'll try to give an overview of what the new starbar is capable of.
New Features
Multiple score views
-
This nifty feature allows us to combine two starbars into one. The prime example of this is viewing a starbar for a site you've already voted on. In the old setup, you could only view your own score and had to visit the site's profile page to see the site's overall score. Now when you vote on a site, the votebar figures it out and displays both the site's score and your rating. We mix the colors of the two bar types where they overlap: your vote is blue and the site score is red, so where they overlap the bar will be purple.
Examples:
![]() | You vote 3 on a site with a score of 4.5 |
![]() | You vote 5 on a site with a score of 1.0 |
![]() | You vote 2 on a site with a score of 2.5 |
I haven't finalized the exact colors yet, but this should give you an idea of how this feature functions. I think at first it may be hard to absorb but over time it will become an integral piece of YTMND.
Extended Security
-
Previously when a user would vote, it would hit a REST interface that at first did nothing but check if the user was logged in. This meant people could link to the interface in an iframe on their websites and cause people to unknowingly vote. After people began exploiting this, the user id number was required for the vote to register. Once this was enabled, people began writing scripts to automatically vote, and even with a user_id it was possible to make targeted users unknowingly vote on sites. The new starbar works on the same principles as the comment voting interface. We now generate a cipher specific to each user using a rolling salt that changes every few minutes. This enables us to ensure (for the most part) that users will not unknowingly vote on sites, and it makes vote scripts and bots more difficult. One of the unfavorable effects of this new system is that after around 20 minutes, pages expire and you will have to refresh them in order to vote. The starbar will notify you if this happens.
Voting feedback
-
With the old votebar, voting was sort of "click-and-pray" in that you had no idea if your vote was registering or not. Due to the massive number of vote lookups, user votes were loaded from a slave database, and if database replication failed, it would look as if your votes weren't registering even if they had. While we are still going to use a slave database for vote lookups, if a vote fails, you will now get a message as to why. Some reasons a vote can fail:
- You are trying to vote on or favorite a site you have not seen.
- You are trying to vote on or favorite a site you own.
- Your authentication has expired.
- An internal error (which should almost never happen).
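The "authentication has expired" case follows from the rolling salt described under Extended Security. Here is a minimal sketch of how such a per-user token could work. The secret, the 5-minute window, the four-window lifetime and the use of HMAC-SHA256 are all assumptions for illustration; the post does not say how the real cipher is built.

```python
import hashlib
import hmac

SECRET = b"server-side secret"  # hypothetical; would never be sent to the browser
WINDOW_SECONDS = 300            # assumed: the salt rolls every 5 minutes
MAX_AGE_WINDOWS = 4             # assumed: current window plus the 3 before it


def _sign(user_id, window):
    # The window number acts as the rolling salt: same user, new window, new token.
    msg = f"{user_id}:{window}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_token(user_id, now):
    """Token embedded in the page when it is rendered at time `now` (seconds)."""
    return _sign(user_id, int(now) // WINDOW_SECONDS)


def check_token(token, user_id, now):
    """Accept the token only if it was issued to this user in a recent window."""
    current = int(now) // WINDOW_SECONDS
    for age in range(MAX_AGE_WINDOWS):
        if hmac.compare_digest(token, _sign(user_id, current - age)):
            return True
    return False
```

With these assumed numbers a page stops being able to vote somewhere between 15 and 20 minutes after it was rendered, and a token copied from one user is useless for another.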
Site-wide favorite recognition and vote loading
-
The starbar now checks if you've got each YTMND on your favorites list across the entire site. Once it is out of beta, any remaining areas where your votes aren't shown or voting is not allowed (such as in search results) will be updated to show your vote/allow you to vote.
Quick favorite and un-favorite
-
A new starbar addon which is appended to the end of the starbar in some places (currently only on the site itself and the site profile) allows you to add a favorite to your list with a single click. Sites that are currently on your favorite list will allow you to "unfavorite" them when you hover over their starbar.
Additionally, adding favorites has been changed so that when you add a site to your favorites list, it will automatically vote five on that site. When you "unfavorite" a site, the vote will remain. Once the starbar is out of beta, all previous favorites will be updated to five-star votes (if they were fav'd without voting, a new five-star vote will be added).
Holy shit: Collaborative filtering.
-
This is a feature I've devoted some free time to for over a year. For those of you who don't know what collaborative filtering is, it's when you take a massive amount of data on who likes what and use it to figure out what each user might like. Simply put, collaborative filtering allows us to predict what you will rate a site you haven't even seen, based on how you've voted in the past.
Sadly, the majority of you have no idea what an amazingly complicated feat this is to accomplish. The small number of you who have dealt with this type of system in college or business dealings will understand how awesome it is that we are launching this on such a limited hardware platform.
<technical jargon>
-
I'll try to give an idea of how I accomplished this for the two or three of you who are interested in the technical side of this. First we load the more than twenty million YTMND votes into the memory of a C++ program I wrote, which uses Simon Funk's SVD algorithm to calculate "feature" scores for each user and site. This consists of hundreds of billions of calculations, and with the full YTMND data set it takes roughly 70 minutes using 100% of a 2.4 GHz AMD Opteron and around 500 MB of memory. At this point we export the feature data to a SQL VIEW, which allows us to calculate the prediction for any given user+site combination by multiplying each site feature by the corresponding user feature and summing the results. We do this instead of storing each prediction because that would require (users*sites) rows (currently around 180 billion) in a database, most of which would never be accessed.
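For the curious, here is a toy sketch of this style of factorization in Python rather than C++. It is not the YTMND program: it updates all features on every vote instead of training one feature at a time as Funk's write-up does, and the feature count, learning rate and regularization values are placeholders chosen for a tiny example.

```python
import random


def predict(user_features, site_features):
    """A prediction is the sum of (user feature * matching site feature)."""
    return sum(u * s for u, s in zip(user_features, site_features))


def train(votes, n_users, n_sites, n_features=2, epochs=1000,
          lrate=0.01, reg=0.02, seed=0):
    """Learn feature vectors from (user, site, score) votes by gradient descent.

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    user_f = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
              for _ in range(n_users)]
    site_f = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
              for _ in range(n_sites)]
    for _ in range(epochs):
        for user, site, score in votes:
            err = score - predict(user_f[user], site_f[site])
            for f in range(n_features):
                uf, sf = user_f[user][f], site_f[site][f]
                # Nudge both vectors toward a smaller error, with a small
                # pull toward zero to limit overfitting.
                user_f[user][f] += lrate * (err * sf - reg * uf)
                site_f[site][f] += lrate * (err * uf - reg * sf)
    return user_f, site_f
```

After training, `predict(user_f[u], site_f[s])` works for any user+site pair, including pairs with no vote, which is why only the feature vectors need to be stored rather than one row per pair.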
Due to the nature of the algorithm, a single vote change, addition or removal cascades through the entire data set. For instance, if a user votes on a site, that user's features have to be recalculated based on the new information, and so do that site's features. Then every site that user has voted on has to have its features updated based on the new user feature scores, as does everyone who has voted on the site, and every site they've voted on, and so on. This means the features cannot be updated incrementally and we have to recalculate the feature data in full every time. This can be done every few days or so and then imported into SQL.
Since we have to calculate predictions on the fly, doing straight top-N recommendations isn't very easy: for a single user it requires a massive number of calculations (sites*features, currently around 20 million) and then a bubble sort on over 500,000 numbers. While it's doable, it doesn't really have any place in a production environment at the moment.
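To make the cost concrete, here is a small sketch of a top-N pass for one user. It still has to score every site (the sites*features multiplications), but it swaps the bubble sort for Python's `heapq.nlargest`, which is a substitution made for this example and not what the post describes. The dictionary layout and argument names are hypothetical.

```python
import heapq


def top_n(user_features, site_features, already_voted, n=10):
    """Return the n highest (predicted_score, site_id) pairs for one user.

    site_features maps site_id -> feature list; already_voted is a set of
    site ids to skip. Every remaining site is still scored once.
    """
    scored = (
        (sum(u * s for u, s in zip(user_features, feats)), site_id)
        for site_id, feats in site_features.items()
        if site_id not in already_voted
    )
    # Keeps only the n best candidates instead of sorting the full list.
    return heapq.nlargest(n, scored)
```

Even with a cheaper selection step, the scoring loop touches every site for every request, which is the part that makes this awkward to run on demand.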
We can create an accuracy score for each user by getting a list of their entire vote history and then comparing each vote to its corresponding prediction. So if I vote 5 on a site and the prediction was 4.5, it was off by 0.5. When we average that error across all votes, we can create a score from zero to five based on how far off the predictions are on average. So to summarize it's one gigantic hack.
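A minimal sketch of that accuracy score. The post only says the score runs from zero to five and is based on the average error, so the exact mapping used here (five minus the mean absolute error, floored at zero) is an assumption.

```python
def accuracy_score(pairs):
    """Score a user's predictions from 0 (bad) to 5 (perfect).

    pairs is a list of (actual_vote, predicted_vote). The "5 minus mean
    absolute error" mapping is an assumed formula, not the site's own.
    """
    pairs = list(pairs)
    if not pairs:
        return None  # no vote history, so nothing to measure
    mean_error = sum(abs(vote - pred) for vote, pred in pairs) / len(pairs)
    return max(0.0, 5.0 - mean_error)
```

For the example above, a single vote of 5 against a prediction of 4.5 gives an average error of 0.5 and a score of 4.5.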
</technical jargon>
How this will affect you
-
The system is very specific to your voting history, which means that while some users will get really good predictions, others will get awful predictions. There are two major factors in how accurate your predictions will be: how many votes you've made in the past, and how many votes have been made on each site you get a prediction for.
Simply put, the closer your votes are to how you actually feel a site should be rated, the better your predictions will be. If you make a lot of five-star votes in the hopes that other people will return the favor, your predictions will be bad. If you down-vote out of spite, your predictions will be bad. If you have made very few votes or the site you are getting a prediction for has very few votes, it's likely the prediction will be off. So if you haven't been voting honestly, this system will be almost totally useless.
Depending on how busy the system is, it may take a while to calculate predictions for you, which means that your previous votes and your predictions may take a while to show up on votebars. Depending on how badly YTMND shits the bed over the next few days, I may make predictions an option you can turn off if you feel it isn't useful for you. I will also add something for you to see how accurate your predictions will be on average once we are out of beta.
How this will be used
-
Since the system is so heavily based on vote history, and predictions are only updated at 24 hour intervals at most, this is really not very useful on the front page since the majority of sites there are relatively new and predictions would frequently be inaccurate. The main use of this feature is for browsing large lists of sites like on user profile pages or search results. Predictions will allow you to quickly figure out what sites you may enjoy more than others. The new color for predictions will be gold, which will mix with the current color of scores which is red so the resulting overlap color will be orange. You will not get predictions for sites that you've voted on/favorited already.
Examples:
![]() | A site with a score of 4.5 with a custom prediction of 3.0 |
![]() | A site with a score of 1.0 with a custom prediction of 5.0 |
![]() | A site with a score of 2.5 with a custom prediction of 2.0 |
As you can see, the prediction mix color is subtle. This was done on purpose, as we don't want to influence people's decisions so much as help them sift through a lot of garbage.
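For anyone wondering how the overlap colors could be worked out, here is a rough sketch that treats a starbar as ten half-star segments and averages the two colors wherever both bars cover a segment. The RGB values and the averaging rule are assumptions, since the exact colors are not final.

```python
RED = (255, 0, 0)     # site score (assumed RGB value)
BLUE = (0, 0, 255)    # your vote (assumed RGB value)
GOLD = (255, 215, 0)  # prediction (assumed RGB value)


def mix(a, b):
    """Average two RGB colors channel by channel."""
    return tuple((x + y) // 2 for x, y in zip(a, b))


def starbar(site_score, other_score, other_color, site_color=RED):
    """Return the color of each of the 10 half-star segments (None = empty)."""
    segments = []
    for i in range(10):
        threshold = (i + 1) / 2  # star value at which this segment fills
        in_site = site_score >= threshold
        in_other = other_score >= threshold
        if in_site and in_other:
            segments.append(mix(site_color, other_color))
        elif in_site:
            segments.append(site_color)
        elif in_other:
            segments.append(other_color)
        else:
            segments.append(None)
    return segments
```

Calling `starbar(4.5, 3, BLUE)` gives six purple half-stars followed by three red ones and one empty, and `starbar(1.0, 5.0, GOLD)` gives two orange half-stars followed by eight gold ones.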
If you encounter any problems or bugs, post a comment here and over the next day I'll try to hammer out any problems. Moving on...
Much as I expected, the limited number of you technically capable of understanding the API had little to no interest in using it. There have been three entries to the API Contest so at this point in time, everyone who entered will get a "prize". You still have a couple weeks to enter, so come up with an idea and enter the contest so I don't feel like writing the API was a total waste.
I know YTMND is suffering from "broken-window" syndrome, and I've been working on it a lot, making a lot of small improvements on both the back and front end of the site. Obviously the main issue is now moderation, which is going to be the primary focus, as I am now actively going over the technical design and coming up with a system that will be far more self-sufficient than the current "moderators clean up after users" setup.
Hurray for gigantic news posts.
Add a comment
Same here: I *never* see *any* stars, so I can't vote or even see what a site's score is without going to its profile page. I left-click and right-click where the stars should be, but there's just nothing there. I've tried Safari and Firefox. I'm rooting for ya Max, but for now the whole site's pretty unusable for me.
Awesome! The collaborative filtering feature is great. One thing I'd like to see is the ability somewhere on the "Sites" page to click a link that randomly picks 100 sites that it thinks you'll either like (3-5 stars) or dislike (1-3 stars) but haven't yet voted on. Unfortunately, this would probably make it a lot easier for douchebags who vote 1 or 5 on everything, but it would be an awesome feature for us real YTMND addicts.
Also, I'm a bit torn about the Favorites thing as well. I use it to track worthwhile sites, not necessarily ones that deserve 5's. Even a 3 or 4 could be a favorite site but not worthy of a 5 in my opinion. I realize bookmarks would be a better solution in most cases, but I typically favorite things so I can log in anywhere and show my friends. If favoriting something is automatically going to 5 it, could we get a page in our profile that lists sites we've voted on, sorting them by number of stars?
"So to summarize it's one gigantic hack. "
nice meng heh
a suggestion for voting if you will, perhaps add a vote count per user, in a set interval, which they can max out at (say 30 per hour)
how many YTMND's can one user view in a given hour?
this way you keep the rowset down to user but limit voting abuse.
good job on everything. i was wondering when the favorite system would be updated
I came up with this really great idea, but then saw some possible flaws with it so maybe it's not so great. Here it is anyway - One whole side of the front page be a box listing Users Who've Made Sites Today. Now I don't know how much the number of users who make sites would differ from number of sites, but it couldn't be more and might be a lot less. Have the 'today' represent 24 hours from last site made rather than having a daily cut off that would cause people to all post at the same time.
Have this box randomly rotate to feature everyone - kind of the way Worthwhile did. So each time you refresh the page there is a new random list of Users Who've Made A Site Today. A friendly request by you that people only post sites using one account per day would be enough for me to do that, certainly. Rather than stars perhaps there just either be some sort of icon by the name or not, like a green check mark, depending on how well the user's site / sites have been received
Now for the biggie: Instead of voting (Gasp!) have a Recommend This Site option. This would decide how the user is displayed in the new box. If the site is recommended enough it would display some sort of icon, as I said. If not, it simply doesn't. There would be no way to downvote someone. Either you think others will like the site or not - no big whoop. No bitterness, no mass alt downvoting, no trolls. And it would make people abusing alts to upvote themselves less of a factor...
So they used alts to get the stupid green checkmark icon - big whoop. People will be viewing these sites partly based on your past work - not on your willingness to exploit an honor system. Also, I guess you could do this and keep the voting system intact. Keep U&C, keep Top Rated, but use the Recommend This Site just to give some sort of heads up to visitors as to what name they should click on rather than a total crap shoot.
I think it's worth it because alt/buddy upvoting and troll/revenge downvoting have too much of an impact over what people see on YTMND. This new box and rating system would not only take away the need to cheat, it would render many of the oldest issues moot. And as a bonus it would lend itself to more of a profile based site, which you seem to mention in a lot of these news posts.
And a last thought on the Recommend This Site feature: You mentioned wanting to make it so your voting history would impact how much your vote counted. This feature might make this more possible. If you have bad taste then the impact of you clicking the Recommend button is lessened. Perhaps do a scale of 10. If your history of voting/recommending is exactly what the average is on each site your clicks are worth 10. If you are Fourest or The Punisher your clicks are worth 1, and so on
Again, abusing alts either way / just voting 1 or 5 / group upvoting - would have ZERO impact on the amount of time sites are featured in the new box. How the new box would impact the rest of the site is wide open, I think. It could not change it at all or you could redo the whole system if you wanted. --- I remember I said I had thought of some problems with this, but I've forgotten them, so pretend I didn't say that
Oh, and Featured Users is gay and you know it, but even that could be represented. Simply adjust the frequency (or odds, or whatever) that the pinkies show up in the new box. Not an obscene ratio compared to the rest of us second class citizens, but enough to keep ... whatever gay thing you are trying to accomplish by having featured users.
Certainly that is less complicated. But I like the idea of making the front page more user orientated rather than site. RC as is you see a title and it could be an all time great site by Nutnics or it could be a stupid screen shot news story with old fad music. Which is why people tend to avoid it because they assume they will see the good sites elsewhere. Or they only click on RC sites that have 5 stars lit up, which leads to vote whoring and alts. - Also, I realize what I said about rating the voters voting strength is a flawed, stupid idea, so ignore that bit
I think max is working on getting rid of the Featured Users box and replacing it with a box of sites recently made by your Favorite Users. This would most likely be much more effective than the current system. Unless your idea means you want to find underviewed sites by obscure users. Chances are the sites in Recommend This Site will probably be the same sites that are in U&C/Top Rated. Just keep browsing around YTMND until you find that undiscovered underviewed gem.
Voting is not going away, live with it. A recommendation system isn't a bad idea though, but if you give that power to everyone you'll get the same alt-abuse issues voting has. I still say, get rid of the featured users section, but keep featured users. Then give featured users and ONLY featured users the ability to recommend sites, and replace the featured users section with sites RECOMMENDED by featured users. That way, alt abuse is impossible, and it would allow the featured list to include everyone.
My thinking was that you take away the reason to alt vote. By expanding RC and having it rotate like the old Worthwhile box - each site for a full day would make those who lack a moral compass less inclined to cheat - exposure would be there for every site "ir"regardless. And hopefully the Recommend feature would help the visitor avoid crap / noise sites by indicating which ones are well received. Sure FPA members would all recommend each other's sites, but the % wouldn't be high enough to highlight it
Necronomicon, I'm sorry but the featured users have virtually the same sh*tty taste (per capita) as the non-featured users. Just go through the pink folks' sites: not just the ones that regularly produce front page stuff, but the whole lot of them. There's a sh*t ton of pure (often stolen) bullsh*t there.
Anyone else think the minimum rating needed to appear on the front page's top viewed should be raised from 2.5 to say, 3.75? If the idea was to keep crap sites that happened to be submitted at midnight or linked from elsewhere from showing up on the front page, it's not succeeding, because there are a lot of complete wastes of Internet that can manage above a 2.5.
It comes down to how most people want as little crap as possible on the front page. Someone might sponsor a bad site, or a featured user might make something lousy, but those sections are avoidable. Most people see the Top Viewed as too integral a part of the front page to be just a mere gauge of what people are looking at regardless of actual quality.
Atrahasis, the Top Viewed is self perpetuating. If it starts the day on Top Viewed it will usually stay on Top Viewed even if it's not in any other front page box. The only exceptions being sites that fall below 2.5. That's an insanely low score. 3.5 is a good mark. 3.75 is probably a little too high
I will assume we will be able to sort by predicted rating? Plus do you believe this 'Collaborative filtering', even with the light tint, will be very likely to influence potential votes? An example being if someone sees the yellow stars at, say, 4 or 5, they will not grant that vote simply because in their eyes it has already been granted.





