developer.yahoo.net
Thanks to the roughly 175 developers who came to Yahoo! recently for our monthly Hadoop User Group meeting. The energy in the packed room was phenomenal, and conversations continued long after the formal sessions. Hundreds of Hadoop Fans Flock to Yahoo! for the Hadoop User Group The event started with Arun Murthy from Yahoo! describing the best practices for developing MapReduce applications. Arun
Apache Hadoop is a software framework to build large-scale, shared storage and computing infrastructures. Hadoop clusters are used for a variety of research and development projects, and for a growing number of production processes at Yahoo!, eBay, Facebook, LinkedIn, Twitter, and other companies in the industry. It is a key component in several business-critical endeavors representing a very sign
Today we’re making some important announcements on the transition of our Search back-end infrastructure to Microsoft, and how this transition impacts the Search APIs and web services we offer on the Yahoo! Developer Network. We are also sharing specific news about several of our other developer services. Over recent years, Yahoo! has made a commitment to developers by opening products, services,
Thanks to the more than 200 developers who came to Yahoo! recently for our monthly Hadoop User Group meeting. The energy in the packed room was phenomenal, and conversations continued long after the formal sessions. Hundreds of Hadoop Fans Flock to Yahoo! for the Hadoop User Group The event started with Nitin Motgi from Yahoo! describing the challenge of content optimization at scale and how Yahoo! is
Thanks to Tsz-Wo Nicholas Sze and Mahadev Konar for this article. The Problem of Many Small Files The Hadoop Distributed File System (HDFS) is designed to store and process large (terabyte-scale) data sets. At Yahoo!, for example, a large production cluster may have 14 PB of disk space and store 60 million files. However, storing a large number of small files in HDFS is inefficient. We call a file sm
The WebTiming spec — proposed by Google to the W3C — is an important step forward in measuring a Web page's round-trip time (the time between a user requesting a page and the page becoming usable). In the past, we've had to either approximate the value by putting a timer at the start of our document, or use a cookie to store the time when the previous page's onbeforeunload* event fired. Both appr
NodeJS has been garnering a lot of attention of late. Briefly, NodeJS is a server-side JavaScript runtime implemented using Google's highly performant V8 engine. It provides an (almost) completely non-blocking I/O stack which, when combined with JavaScript closures and anonymous functions, makes it an excellent platform for implementing high throughput web services. As an example, a simple "Hello, wo
YDN Hadoop and Distributed Computing at Yahoo! Managing Big Data: Architectural Approaches for making batch data available online This is the beginning of an ongoing series of blog posts on “Managing Big Data”. This series will focus on techniques that Yahoo uses to process large volumes of data, ranging from initial collection of data to the end usage of that data. Introduction Over the last seve
Several years ago, we released a little Firebug plugin called YSlow. This tool allows you to analyze your Web pages for performance problems. Among the things that YSlow can do are measuring page-load time and beaconing its results back to a home server. Since then, the front-end performance discipline has grown significantly with page-speed and page-test tools, and also services that trend
Yahoo! will soon open up the platform behind the popular Yahoo! Messenger service. This is big news. If you haven’t used Yahoo! Messenger, you’re in for a treat. Yahoo! Messenger is the premier instant messaging (IM) platform, used on a wide variety of desktop and mobile clients. Millions of users throughout the world already depend on Yahoo! Messenger to manage their social contacts, group lists,
Pig, Cascalog & HBase Among Highlights of May Hadoop Meet-Up Hi Hadoopers Thanks to close to 300 developers who came this week to Yahoo! for our monthly Hadoop User Group meeting. The energy in the packed room was phenomenal and conversations continued long after the formal sessions. Hundreds of Hadoop Fans Flock to Yahoo! for the May Hadoop User Grou
You probably have questions like these about traffic on a TCP (Transmission Control Protocol) server (or client): How many connections lasted more (or less) than X milliseconds? How many connections needed more than N attempts to succeed? What is the distribution of connection duration or connection throughput? What is the distribution of connection duration or throughput for connections in which
The bottom line is that we achieved the target in Petabytes and got close to the target in the number of files. But this is done with a smaller number of nodes and the need to support a workload close to 100,000 clients has not yet materialized. The question now is whether the goals are feasible with the current system architecture. Namespace Limitations HDFS is based on an architecture where the
Last week Yahoo! launched an Earth Day campaign with a dedicated page, plus other campaign features spread across Yahoo! properties. Being the frontend engineer in charge of implementing such features for the Yahoo! Search Team and a tree-hugger, I had to find a way to go green on this task. So, why not optimize it to the bone? The page itself is pretty light and hasn't many features besides the main sea
More and more websites are integrating with third-party services, including logging in and content sharing. Users who log in with existing accounts are far more valuable than users who register new accounts. And that's where XAuth (eXtended Authentication) comes in. Unlike newly registered accounts, existing third-party accounts have rich profile data and services capable of driving tremendous ref
At Yahoo!, grids running Hadoop have attracted a wide range of applications from a diverse set of functional groups. The workload submitted by each group is distinguished not only from others in the cluster, but its profile also changes as users gain experience with Hadoop. Users tune their jobs to consume available resources; some may circumvent the assumptions and fairness control mechanisms of
This article is the second in a series and part of ongoing research on web app performance. Get updates on the latest YDN articles via Twitter, follow @ydn. The Domain Name System (DNS) is part of the "dark matter" of the internet. It's hard to observe the DNS directly yet it exerts an obscure, pervasive influence without which everything would fly apart. Because it's so difficult to probe people
YQL is a great tool to scrape HTML from the web and turn it into data to reuse. This is not an illegal act as it can be very useful to reuse information maintained for example on a blog. My personal portfolio page http://icant.co.uk gets most of its data from my blog hosted elsewhere. Using the in-built YQL table for html allows you to scrape any HTML that allows the YQL server to access it (some
Yahoo! engineers love hacking their iPhones, and that's why we're happy to announce that the Yahoo! OpenID and OAuth services are now iPhone optimized! Yahoo! users no longer need to pinch and spread when signing into websites with their Yahoo! OpenID, or when authorizing data sharing using OAuth. OpenID streamlines the sign in and registration process, making it easy for users to reuse an account
Web app developers spend most of our time not thinking about how data is actually transmitted through the bowels of the network stack. Abstractions at the application layer let us pretend that networks read and write whole messages as smooth streams of bytes. Generally this is a good thing. But knowing what's going on underneath is crucial to performance tuning and application design. The character o
Introduction In a typical Hadoop MapReduce job, input files are read from HDFS. Data are usually compressed to reduce the file sizes. After decompression, serialized bytes are transformed into Java objects before being passed to a user-defined map() function. Conversely, output records are serialized, compressed, and eventually pushed back to HDFS. This seemingly simple, two-way process is in fact
The Yahoo! Query Language lets you query, filter, and join data across any web data source or service on the web. Using our YQL web service, apps run faster with fewer lines of code and a smaller network footprint. YQL uses a SQL-like language because it is a familiar and intuitive method for developers to access data. YQL treats the entire web as a source of table data, enabling developers to sel
Over the years, many developers have asked us to make our Auth UIs (the user interface for logging in and verifying user ID and password) less jarring and disruptive. Until today, all of our authorization services, including OAuth, OpenID, and BBAuth used a "redirect" UI, which required sites to redirect the user's browser over to Yahoo! to ask for the user's approval before sharing the user's dat
There's a paradigm shift going on in the industry about how we deal with computing infrastructure. Services such as Amazon Web Services, Google App Engine, and various other cloud infrastructure providers are changing the way that companies think about writing, hosting, and deploying web applications. At Yahoo!, we're also getting on the cloud. Our deep involvement with Hadoop is the best-known ex
All Yahoo! services using OAuth are now upgraded to the new OAuth 1.0a version of the protocol, resolving the session fixation security issue. The upgraded services include all Y!OS APIs (Contacts, Updates, Status, and Social Directory) and Fire Eagle. Users authorizing applications using OAuth 1.0a will not see the security interstitial screen that is displayed for apps that are still using the o
You might have heard about the Yahoo! Query Language (YQL) by now. I know I've been banging on about it enough. As we say in Britain: it's the mutt's nuts. Get Web data but do it like calling a database. That's pretty sweet. With YQL Open Tables you can contribute back to YQL by sharing your own tables for web services you like. Currently we have almost 100 community Open Tables. You can see them
In the last release I took on the task of setting up a true system test environment for Apache ZooKeeper. Our previous environment ran the system test in a single JVM instance, which meant that there were some test scenarios that we just couldn't reproduce. In this new environment we wanted to be able to run tests across multiple hosts and deal with different numbers of machines and cluster enviro