Sequenz No.3 Rex tremendae & Sequenz No.5 Confutatis

Good evening pals,

It is with much pride I present to you the new YTMND starbar tonight. Not only does it provide some new functionality, but it also represents a great technical achievement.

The new starbar allows multiple score views, voting feedback, site-wide favorite recognition, and single-click adding/removing of favorites. It also pulls off a spectacular feat: collaborative filtering vote prediction. Read more for a long-winded description and explanation.

The original "vote bar" was one of the oldest pieces of YTMND. It was created from stolen Netflix code for a project that predates YTMND by almost a year and was basically an archaic piece of junk. I had created a program to generate colored star bars and the two fit well, so it became an integral part of YTMND.

It had some major technical issues attached to it, though. One was that a user's score had to be passed from the server at the time the page loaded, and since votebars show up randomly across pages, each bar ended up requiring an extra database query. This has changed completely: votes now load after the page loads, turning up to hundreds of queries into one.
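To give a rough idea of what "hundreds of queries into one" can look like, here is a minimal Python sketch. It is not the actual YTMND code: the `votes` table, its columns, and the `load_votes` helper are all hypothetical, and SQLite is used only so the example runs anywhere.

```python
import sqlite3

def load_votes(conn, user_id, site_ids):
    """Fetch one user's votes for many sites with a single query.

    The table and column names are assumptions made for this sketch.
    """
    if not site_ids:
        return {}
    # One "?" placeholder per site id, so the whole page is one lookup.
    placeholders = ",".join("?" for _ in site_ids)
    rows = conn.execute(
        "SELECT site_id, score FROM votes "
        f"WHERE user_id = ? AND site_id IN ({placeholders})",
        [user_id, *site_ids],
    ).fetchall()
    return dict(rows)

# Tiny in-memory data set standing in for the real vote table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (user_id INTEGER, site_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 11, 3), (2, 10, 1)],
)

# Every starbar on the page asks for its site id; one query answers them all.
votes_for_page = load_votes(conn, 1, [10, 11, 12])
```

Sites the user has not voted on (site 12 here) are simply missing from the result, so the page can leave those bars showing only the site score.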

Another major issue was that it provided no feedback if there was a problem. From the user's perspective, you voted, and if something went wrong on the back end, you had no idea. This has changed as well.

So the next couple weeks will be a "forced beta" to see how well everything holds up. Instead of a giant wall of text, I'll try to give an overview of what the new starbar is capable of.

New Features

    Multiple score views


      This nifty feature allows us to combine two starbars into one. The prime example of this is viewing a starbar for a site you've already voted on. In the old setup, you could only view your own score and had to visit the site's profile page to see the overall score of the site. Now when you vote on a site, the votebar figures it out and displays both the site's score and your rating. We mix the colors of the two bar types where they overlap. Your vote is blue and the site score is red, so where they overlap the bar will be purple.

      Examples:

      You vote 3 on a site with a score of 4.5
      You vote 5 on a site with a score of 1.0
      You vote 2 on a site with a score of 2.5

      I haven't finalized the exact colors yet, but this should give you an idea of how this feature functions. I think at first it may be hard to absorb but over time it will become an integral piece of YTMND.
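For the curious, the overlap rule can be sketched in a few lines of Python. This is only an illustration under the assumption that both bars start at the left edge and fill to the right; the `bar_segments` function and its color names are made up for the example and are not the real starbar code.

```python
def bar_segments(user_vote, site_score):
    """Split a combined starbar into colored segments.

    Returns a list of (start, end, color) tuples measured in stars.
    Assumes blue = your vote, red = site score, purple = overlap.
    """
    overlap = min(user_vote, site_score)
    top = max(user_vote, site_score)
    segments = []
    if overlap > 0:
        # Both bars cover this stretch, so the colors mix.
        segments.append((0.0, float(overlap), "purple"))
    if top > overlap:
        # Only the longer bar reaches this far, so it keeps its own color.
        color = "blue" if user_vote > site_score else "red"
        segments.append((float(overlap), float(top), color))
    return segments
```

For the first example above (you vote 3 on a site with a score of 4.5), this gives a purple stretch from 0 to 3 stars and a red stretch from 3 to 4.5.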

    Extended Security

      Previously, when a user voted, the vote would hit a REST interface that at first did nothing but check whether the user was logged in. This meant people could link to the interface in an iframe on their websites and cause people to vote unknowingly. After people began exploiting this, the user ID number was required for the vote to register. Once this was enabled, people began writing scripts to vote automatically, and even with a user_id it was possible to make targeted users unknowingly vote on sites. The new starbar works on the same principles as the comment voting interface. We now generate a cipher specific to each user using a rolling salt that changes every few minutes. This lets us ensure (for the most part) that users will not unknowingly vote on sites, and it makes vote scripts and bots more difficult. One of the unfavorable effects of this new system is that after around 20 minutes, pages expire and you will have to refresh them in order to vote. The starbar will notify you if this happens.
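Here is a minimal Python sketch of how a per-user token with a rolling salt could work. It is an illustration, not the scheme YTMND actually uses: the HMAC construction, the `SECRET` value, the 5-minute window, and the choice to accept the last four windows are all assumptions picked so that tokens expire somewhere between 15 and 20 minutes after they are issued.

```python
import hashlib
import hmac

SECRET = b"server-side secret (hypothetical)"
WINDOW_SECONDS = 300  # assumed: the salt rolls over every 5 minutes
VALID_WINDOWS = 4     # assumed: accept the current window plus the 3 before it

def _sign(user_id, window):
    """Derive a token from the user id and the time window number."""
    msg = f"{user_id}:{window}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def make_token(user_id, now):
    """Token embedded in the page when it is rendered at time `now` (seconds)."""
    return _sign(user_id, int(now) // WINDOW_SECONDS)

def check_token(user_id, token, now):
    """Accept the token only if it matches one of the recent windows."""
    current = int(now) // WINDOW_SECONDS
    return any(
        hmac.compare_digest(token, _sign(user_id, current - age))
        for age in range(VALID_WINDOWS)
    )
```

Because the token depends on the user, a link planted on another website cannot carry a valid token for whoever happens to load it, and because it depends on the time window, a captured token stops working once the page would have expired anyway.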

    Voting feedback

      With the old votebar, voting was sort of "click-and-pray" in that you had no idea if your vote was registering or not. Due to the massive amount of vote lookups, user votes were loaded from a slave database, and if database replication failed, it would look as if your votes weren't registering even if they had. While we are still going to use a slave database for vote lookups, if a vote fails, you will get a message as to why. Some instances of this are:
      • You are trying to vote on or favorite a site you have not seen.
      • You are trying to vote on or favorite a site you own.
      • Your authentication has expired.
      • An internal error (which should almost never happen).

    Site-wide favorite recognition and vote loading

      The starbar now checks if you've got each YTMND on your favorites list across the entire site. Once it is out of beta, any remaining areas where your votes aren't shown or voting is not allowed (such as in search results) will be updated to show your vote/allow you to vote.

    Quick favorite and un-favorite

      A new starbar addon which is appended to the end of the starbar in some places (currently only on the site itself and the site profile) allows you to add a favorite to your list with a single click. Sites that are currently on your favorite list will allow you to "unfavorite" them when you hover over their starbar.

      Additionally, adding favorites has been changed so that when you add a site to your favorites list, it will automatically vote five on that site. When you "unfavorite" a site, the vote will remain. Once the starbar is out of beta, all previous favorites will be updated to five-star votes (if they were fav'd without voting, a new five-star vote will be added).

    Holy shit: Collaborative filtering.

      This is a feature I've devoted some free time to for over a year. For those of you who don't know what collaborative filtering is, it's when you take a massive amount of data on who likes what and use it to figure out what each user might like. Simply put, collaborative filtering allows us to predict what you will rate a site you haven't even seen based on how you've voted in the past.

      Sadly, the majority of you have no idea what an amazingly complicated feat this is to accomplish. The small number of you who have dealt with this type of system in college or business dealings will understand how awesome it is that we are launching this on such a limited hardware platform.

      <technical jargon>

        I'll try to give an idea of how I accomplished this for the two or three of you who are interested in the technical side of this. First we gather the over twenty million YTMND votes into the memory of a C++ program I wrote which uses Simon Funk's SVD algorithm to calculate "feature" scores for each user and site. This consists of hundreds of billions of calculations, and with the full YTMND data set it takes roughly 70 minutes using 100% of a 2.4 GHz AMD Opteron and around 500 MB of memory. At this point we export the feature data to a SQL VIEW which allows us to calculate the prediction by performing (site features * corresponding user features), summed across the features, for any given user+site combination. We do this instead of storing each prediction because that would require (users*sites) rows (currently around 180 billion) in a database, most of which would never be accessed.
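For anyone who wants to see the shape of the idea, here is a toy Python sketch in the spirit of Funk-style SVD. It is emphatically not the C++ program described above: the function names, the learning rate, the regularization value, the feature count, and the six-vote data set are all assumptions chosen so the example runs in a blink. Features are trained one at a time on whatever error the earlier features leave behind, and a prediction is just the sum of (user feature * site feature) products.

```python
def train_features(ratings, n_users, n_sites, n_features=2,
                   epochs=200, lrate=0.01, reg=0.02):
    """Learn per-user and per-site feature vectors from (user, site, score) votes.

    A simplified, unoptimized sketch; every hyperparameter here is an assumption.
    """
    user_f = [[0.1] * n_features for _ in range(n_users)]
    site_f = [[0.1] * n_features for _ in range(n_sites)]
    for f in range(n_features):
        for _ in range(epochs):
            for user, site, score in ratings:
                # Predict with the features trained so far, including this one.
                pred = sum(user_f[user][k] * site_f[site][k] for k in range(f + 1))
                err = score - pred
                uf, sf = user_f[user][f], site_f[site][f]
                # Nudge both sides toward a smaller error, with mild shrinkage.
                user_f[user][f] += lrate * (err * sf - reg * uf)
                site_f[site][f] += lrate * (err * uf - reg * sf)
    return user_f, site_f

def predict(user_f, site_f, user, site):
    """Prediction = sum of user feature * site feature, clamped to 1..5 stars."""
    raw = sum(u * s for u, s in zip(user_f[user], site_f[site]))
    return min(5.0, max(1.0, raw))

# Hypothetical votes: (user index, site index, score).
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 2), (2, 1, 3), (2, 2, 2)]
user_f, site_f = train_features(ratings, n_users=3, n_sites=3)

# User 0 never voted on site 2, but we can still ask for a prediction.
guess = predict(user_f, site_f, 0, 2)
```

Storing only the two feature tables and multiplying on demand is what keeps the storage at (users + sites) * features numbers instead of one row per user+site pair.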

        Due to the nature of the algorithm, a single vote change or addition/removal recurses the entire tree. For instance, if a user votes on a site, that user's features have to be recalculated based on the new information and that site's features have to be recalculated as well; any site that user has voted on has to have its features updated based on the new user features scores as well as anyone who has voted on the site and any sites they've voted on etc. This means that the features can not be updated incrementally and we have to recalculate feature data in full every time. This can be done every few days or so and then imported into SQL.

        Since we have to calculate predictions on the fly, doing straight top-N recommendations isn't very easy as it requires a massive amount of calculations (sites*features (currently around 20 million)) and then a bubble sort on over 500,000 numbers for a single user. While it's doable, it doesn't really have any place in a production environment at the moment.
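As a rough illustration of what a top-N pass involves, here is a small Python sketch. It swaps in a heap-based selection (`heapq.nlargest`) in place of the bubble sort mentioned above, and the `top_n` function, its arguments, and the sample feature values are all hypothetical. It still has to compute a prediction for every site, which is the expensive part.

```python
import heapq

def top_n(user_features, site_features, n, already_voted=frozenset()):
    """Return the ids of the n sites with the highest predicted score.

    `site_features` maps a site id to its feature list (an assumed layout).
    """
    scored = (
        (sum(u * s for u, s in zip(user_features, feats)), site_id)
        for site_id, feats in site_features.items()
        if site_id not in already_voted
    )
    # nlargest keeps only the best n candidates instead of sorting everything.
    return [site_id for _, site_id in heapq.nlargest(n, scored)]

# Made-up features for three sites and one user.
sites = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
user = [2.0, 1.0]
best_two = top_n(user, sites, 2)
```

With these numbers the predicted scores are 2 for "a", 1 for "b", and 3 for "c", so the two best picks are "c" and then "a".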

        We can create an accuracy score for each user by getting a list of their entire vote history and then comparing each vote to its corresponding prediction. So if I vote 5 on a site and the prediction was 4.5, it was off by 0.5. When we average that difference across all votes, we can create a score from zero to five based on how far off the predictions are on average. So to summarize, it's one gigantic hack.
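The accuracy score described above could be sketched like this in Python. The averaging step follows the description; the final mapping (five minus the average miss, floored at zero) is purely an assumption made for the example, since the exact formula isn't spelled out here, and the function name is made up.

```python
def accuracy_score(votes, predictions):
    """Score from 0 to 5 based on how far predictions miss on average.

    `votes` and `predictions` are parallel lists. The "5 minus average miss"
    mapping is an assumption, not necessarily the real formula.
    """
    if not votes:
        return None  # no vote history, nothing to measure
    misses = [abs(vote - pred) for vote, pred in zip(votes, predictions)]
    average_miss = sum(misses) / len(misses)
    return max(0.0, 5.0 - average_miss)

# Voting 5 against a prediction of 4.5 is a miss of 0.5, and so on.
score = accuracy_score([5, 3, 1], [4.5, 3.0, 2.0])
```

Here the misses are 0.5, 0, and 1, which average to 0.5, so the example user would get 4.5 under this assumed mapping.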

      </technical jargon>


      How this will affect you

        The system is very specific to your voting history. This means that while some users will get really good predictions, others will get awful ones. There are two major factors in how accurate your predictions will be: how many votes you've made in the past, and how many votes have been made on each site you get a prediction for.

        Simply put, the closer your votes are to how you actually feel a site should be rated, the better your predictions will be. If you make a lot of five-star votes in the hopes that other people will return the favor, your predictions will be bad. If you down-vote out of spite, your predictions will be bad. If you have made very few votes or the site you are getting a prediction for has very few votes, it's likely the prediction will be off. So if you haven't been voting honestly, this system will be almost totally useless.

        Depending on how busy the system is, it may take a while to calculate predictions for you. This means that your previous votes and your predictions may take a while to show up on votebars. Depending on how badly YTMND shits the bed over the next few days, I may make predictions an option you can turn off if you feel they aren't useful for you. I will also add something that lets you see how accurate your predictions are on average once we are out of beta.

      How this will be used

        Since the system is so heavily based on vote history, and predictions are only updated at 24 hour intervals at most, this is really not very useful on the front page since the majority of sites there are relatively new and predictions would frequently be inaccurate. The main use of this feature is for browsing large lists of sites like on user profile pages or search results. Predictions will allow you to quickly figure out what sites you may enjoy more than others. The new color for predictions will be gold, which will mix with the current color of scores which is red so the resulting overlap color will be orange. You will not get predictions for sites that you've voted on/favorited already.

        Examples:

        A site with a score of 4.5 with a custom prediction of 3.0
        A site with a score of 1.0 with a custom prediction of 5.0
        A site with a score of 2.5 with a custom prediction of 2.0

        As you can see, the prediction mix color is subtle. This was done on purpose, as we don't want to influence people's decisions so much as help them sift through a lot of garbage.


    If you encounter any problems or bugs, post a comment here and over the next day I'll try to hammer out any problems. Moving on...

    Much as I expected, the limited number of you technically capable of understanding the API had little to no interest in using it. There have been three entries to the API Contest so at this point in time, everyone who entered will get a "prize". You still have a couple weeks to enter, so come up with an idea and enter the contest so I don't feel like writing the API was a total waste.

    I know YTMND is suffering from "broken-window" syndrome, and I've been working on it a lot, making a lot of small improvements on both the back and front end of the site. Obviously the main issue now is moderation, which is going to be the primary focus as I am now actively going over the technical design and coming up with a system that will be far more self-sufficient than the current "moderators clean up after users" setup.

    Hurray for gigantic news posts.

November 30th, 2007
346th
(0)
THX
November 28th, 2007
309th
(0)
I came up with this really great idea, but then saw some possible flaws with it so maybe it's not so great. Here it is anyway - One whole side of the front page be a box listing Users Who've Made Sites Today. Now I don't know how much the number of users who make sites would differ from number of sites, but it couldn't be more and might be a lot less. Have the 'today' represent 24 hours from last site made rather than having a daily cut off that would cause people to all post at the same time.
November 28th, 2007
310th
(0)
Have this box randomly rotate to feature everyone - kind of the way Worthwhile did. So each time you refresh the page there is a new random list of Users Who've Made A Site Today. A friendly request by you that people only post sites using one account per day would be enough for me to do that, certainly. Rather than stars perhaps there just either be some sort of icon by the name or not, like a green check mark, depending on how well the user's site / sites have been received
November 28th, 2007
311th
(0)
Now for the biggie: Instead of voting (Gasp!) have a Recommend This Site option. This would decide how the user is displayed in the new box. If the site is recommended enough it would display some sort of icon, as I said. If not, it simply doesn't. There would be no way to downvote someone. Either you think others will like the site or not - no big whoop. No bitterness, no mass alt downvoting, no trolls. And it would make people abusing alts to upvote themselves less of a factor...
November 28th, 2007
312th
(0)
So they used alts to get the stupid green checkmark icon - big whoop. People will be viewing these sites partly based on your past work - not on your willingness to exploit an honor system. Also, I guess you could do this and keep the voting system intact. Keep U&C, keep Top Rated, but use the Recommend This Site just to give some sort of heads up to visitors as to what name they should click on rather than a total crap shoot.
November 28th, 2007
314th
(0)
I think it's worth it because alt/buddy upvoting and troll/revenge downvoting have too much of an impact over what people see on YTMND. This new box and rating system would not only take away the need to cheat, it would render many of the oldest issues moot. And as a bonus it would lend itself to more of a profile based site, which you seem to mention in a lot of these news posts.
November 28th, 2007
315th
(0)
And a last thought on the Recommend This Site feature: You mentioned wanting to make it so your voting history would impact how much your vote counted. This feature might make this more possible. If you have bad taste then the impact of you clicking the Recommend button is lessened. Perhaps do a scale of 10. If your history of voting/recommending is exactly what the average is on each site your clicks are worth 10. If you are Fourest or The Punisher your clicks are worth 1, and so on
November 28th, 2007
316th
(0)
Again, abusing alts either way / just voting 1 or 5 / group upvoting - would have ZERO impact on the amount of time sites are featured in the new box. How the new box would impact the rest of the site is wide open, I think. It could not change it at all or you could redo the whole system if you wanted. --- I remember I said I had thought of some problems with this, but I've forgotten them, so pretend I didn't say that
November 28th, 2007
317th
(0)
Oh, and Featured Users is gay and you know it, but even that could be represented. Simply adjust the frequency (or odds, or whatever) that the pinkies show up in the new box. Not an obscene ratio compared to the rest of us second class citizens, but enough to keep ... whatever gay thing you are trying to accomplish by having featured users.
November 28th, 2007
318th
(0)
Certainly that is less complicated. But I like the idea of making the front page more user orientated rather than site. RC as is you see a title and it could be an all time great site by Nutnics or it could be a stupid screen shot news story with old fad music. Which is why people tend to avoid it because they assume they will see the good sites elsewhere. Or they only click on RC sites that have 5 stars lit up, which leads to vote whoring and alts. - Also, I realize what I said about rating the voters voting strength is a flawed, stupid idea, so ignore that bit
November 28th, 2007
319th
(0)
I think max is working on getting rid of the Featured Users box and replacing it with a box of sites recently made by your Favorite Users. This would most likely be much more effective than the current system. Unless your idea means you want to find underviewed sites by obscure users. Chances are the sites in Recommend This Site will probably be the same sites that are in U&C/Top Rated. Just keep browsing around YTMND until you find that undiscovered underviewed gem.
November 28th, 2007
320th
(1)
Voting is not going away, live with it. A recommendation system isn't a bad idea though, but if you give that power to everyone you'll get the same alt-abuse issues voting has. I still say, get rid of the featured users section, but keep featured users. Then give featured users and ONLY featured users the ability to recommend sites, and replace the featured users section with sites RECOMMENDED by featured users. That way, alt abuse is impossible, and it would allow the featured list to include everyone.
November 28th, 2007
321st
(0)
There are some featured users that have featured alts, you know, lol
November 28th, 2007
322nd
(0)
My thinking was that you take away the reason to alt vote. By expanding RC and having it rotate like the old Whorthwhile box -each site for a full day would make those who lack a moral compass less inclined to cheat - exposure would be there for every site "ir"regardless. And hopefully the Recommend feature would help the visitor avoid crap / noise sites by indicating which ones are well received. Sure FPA members would all recommend each others sites, but the % wouldn't be high enough to highlight it
November 28th, 2007
323rd
(0)
The key being that the sites be listed by User name rather than title. You see my name, Exfurry and Keaton - you know pretty much what you are getting with each and can view or not view based on what you like
November 28th, 2007
324th
(0)
and of course featured users could still show up as pink there
November 28th, 2007
326th
(0)
Hey, I can't write music but I know a good song when I hear it.
November 29th, 2007
328th
(0)
Then it would go from "Dude 5 my site" to "Dude feature my site". I don't think you'd enjoy being part of that very much. But it would probably cut down on Pinks getting downvoted, lol
November 29th, 2007
335th
(3)
So this is the third really long reply thread for this news post that can only be summarized with "Ban Fourest." Should we make another one?
November 29th, 2007
338th
(0)
Just remember Ollj is a featured user. There are a ton of featured users who got there because of one or two sites.
November 29th, 2007
342nd
(1)
I know you weren't directing that comment at me, my point was simply that people who make bad ytmnds don't necessarily have bad taste in ytmnds. Therefore, to sum up, in conclusion, ipso facto... ban fourest.
November 30th, 2007
344th
(0)
omg, did you really read this? why?
November 29th, 2007
329th
(0)
Whoopdy sh*t.. How about some moderation?
November 29th, 2007
330th
(0)
Am I the only one who knows that Larry David did the voice of Steinbrenner on Seinfeld or is that common knowledge?
November 29th, 2007
337th
(0)
He was also the man in the cap and the man in the greenpeace boat along as the original voice of Newman.
November 29th, 2007
340th
(0)
his contributions went even as far as being co-creator
November 29th, 2007
334th
(2)
Get rid of those stupid ads with the retarded looking cats slipping each other the tongue. It's embarrassing going on YTMND when I'm in a public setting with that ad there. I bet people are wondering if I'm watching kitty porn.
November 29th, 2007
339th
(2)
I agree, they're creepy as hell.
November 30th, 2007
345th
(0)
Seriously, I'd rather have that gay butt sex ad we had a while back.
November 30th, 2007
348th
(0)
Those f*cking cats made me turn on adblock. They are that annoying.
December 1st, 2007
350th
(1)
Okay, the fav users page is live (kinda), check it out: http://wiki.ytmnd.com/YTMND:API:Contest#Favorite_Users_Lists_by_BTape
December 2nd, 2007
352nd
(0)
What song is this? What is it from? PUHLEEZ HALP. Respond here or PM me. http://www.sendspace.com/file/az9hwl
December 2nd, 2007
353rd
(0)
Never mind, found it. Helloween's "Deliberately Limited Preliminary Prelude Period in Z." Hanson uses it as their intro music, and I'm not even f*cking kidding.
December 3rd, 2007
360th
(0)
Because I am DOWN with Isaac, Taylor, and Zac.
December 3rd, 2007
354th
(1)
Anyone else think the minimum rating needed to appear on the front page's top viewed should be raised from 2.5 to say, 3.75? If the idea was to keep crap sites that happened to be submitted at midnight or linked from elsewhere from showing up on the front page, it's not succeeding, because there are a lot of complete wastes of Internet that can manage above a 2.5.
December 3rd, 2007
355th
(0)
yea, 3.5 would be fair
December 3rd, 2007
356th
(2)
"If you happen to not like it, don't look at it." Um, too late?
December 3rd, 2007
357th
(0)
Very good idea - and a great user name to boot
December 6th, 2007
363rd
(0)
the point is to show highly viewed sites that aren't highly sh*tty
December 6th, 2007
364th
(0)
a rating cap (especially higher, around 3.5 as I suggested) just helps weed out the bad sites that are forced onto top viewed improperly
December 7th, 2007
365th
(2)
No it doesn't, lol
December 7th, 2007
366th
(1)
It comes down to how most people want as little crap as possible on the front page. Someone might sponsor a bad site, or a featured user might make something lousy, but those sections are avoidable. Most people see the Top Viewed as too integral a part of the front page to be just a mere gauge of what people are looking at regardless of actual quality.
December 8th, 2007
368th
(0)
Atrahasis, the Top Viewed is self perpetuating. If it starts the day on Top Viewed it will usually stay on Top Viewed even if it's not in any other front page box. The only exceptions being sites that fall below 2.5. That's an insanely low score. 3.5 is a good mark. 3.75 is probably a little too high
December 3rd, 2007
358th
(0)
I will assume we will be able to sort by predicted rating? Plus do you believe this 'Collaborative filtering', even with the light tint, will be very likely to influence potential votes? An example being if someone sees the yellow stars at, say, 4 or 5, one will not grant that vote simply because in their eyes it has already been granted.
December 3rd, 2007
361st
(4)
LOL BUBBLE SORT
December 8th, 2007
369th
(-1)
It's Click week on YTMND.
December 8th, 2007
370th
(1)
when i'm on a user's favorites list, and then click on the "sites" tab, it directs me to their votes. please fix this.
December 11th, 2007
372nd
(2)
please.
December 10th, 2007
371st
(0)
truly awesome you have created a way to siphon all ytmnd's from the past into the future, decent perfectly respectable posts that have not seen the light of day in years i salute you
December 12th, 2007
373rd
(0)
BossN*gg*r has upwards of 15 alt accounts and growing. Delete please mods, thanks.
December 13th, 2007
374th
(0)
coding sk1llz so epic I came many times when I saw it.
December 14th, 2007
377th
(0)
Ban Umfuld
December 15th, 2007
379th
(0)
Ban Max
December 15th, 2007
380th
(0)
Ban Ashleigh Banfield
December 14th, 2007
378th
(0)
Max why am I like this? Why? I MISS THE OLD MEMORIES I HAD ON HERE, GIVE THEM BACK TO ME NOW OR I WILL BREAK THE LAW.
January 22nd, 2010
382nd
(0)
This makes no sense now.
February 22nd, 2010
383rd
(0)
Hi Guise