Archive for the 'Conferences' Category

Code Repository for Machine Learning: mloss.org

The folks at mloss.org (Machine Learning Open Source Software) invited a blog post on my roundtable on data and code sharing, held at Yale Law School last November. mloss.org’s philosophy is stated as:

“Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for a wide range of applications. Inspired by similar efforts in bioinformatics (BOSC) or statistics (useR), our aim is to build a forum for open source software in machine learning.”

The site is excellent and worth a visit. The guest blog post Chris Wiggins and I wrote starts:

“As pointed out by the authors of the mloss position paper [1] in 2007, “reproducibility of experimental results is a cornerstone of science.” Just as in machine learning, researchers in many computational fields (or in which computation has only recently played a major role) are struggling to reconcile our expectation of reproducibility in science with the reality of ever-growing computational complexity and opacity. [2-12]

In an effort to address these questions from researchers not only from statistical science but from a variety of disciplines, and to discuss possible solutions with representatives from publishing, funding, and legal scholars expert in appropriate licensing for open access, Yale Information Society Project Fellow Victoria Stodden convened a roundtable on the topic on November 21, 2009. Attendees included statistical scientists such as Robert Gentleman (co-developer of R) and David Donoho, among others.”

Keep reading at http://mloss.org/community/blog/2010/jan/26/data-and-code-sharing-roundtable/. We made a point of referencing work on reproducibility in computational science from other fields.

What's New at Science Foo Camp 2009

SciFoo is a wonderful annual gathering of thinkers about science. It’s an unconference and people who choose to speak do so. Here’s my reaction to a couple of these talks.

In Pete Worden’s discussion of modeling future climate change, I wondered about the reliability of simulation results. Worden conceded that there are several models making the same kinds of predictions he showed, and that they can give wildly divergent results. We need to develop the machinery to quantify error in simulation models just as we routinely do for conventional statistical modeling: simulation is often the only empirical tool we have for guiding policy responses to some of our most pressing issues.

But the newest idea I saw was Bob Metcalfe’s call for us to imagine what to do with the coming overabundance of energy. Metcalfe likened solving energy scarcity to the early days of Internet development: because of the generative design of Internet technology, we now have things that were unimagined in the early discussions, such as YouTube and online video. According to Metcalfe, we need to envision our future as including a “squanderable abundance” of energy, and use Internet lessons such as standardization and distribution of power sources to get there, rather than building for energy conservation.

Cross posted on The Edge.

Bill Gates to Development Researchers: Create and Share Statistics

I was recently in Doha, Qatar, presenting my research on global communication technology use and democratic tendency at ICTD09. I spoke right before the keynote, Bill Gates, whose main point was that when you engage in a goal-oriented activity, such as development, progress can only be made when you measure the impact of your efforts.

Gates paints a positive picture, measured by deaths before age 5. In the 1880s, he says, about 30% of children died before their 5th birthday in most countries; the number of deaths gradually fell to 20 million in 1960 and then 10 million in 2006. Gates postulates this is due to rising income levels (40% of the decrease) and medical innovation such as vaccines (60% of the decrease).

This is an example of Gates’ mantra: you can only improve what you can measure. For example, an outbreak of measles tells you your vaccine system isn’t functioning. In his example about childhood deaths, he says we are getting somewhere here because we are measuring the value for money spent on the problem.

Gates thinks the wealthy in the world need to be exposed to these problems, ideally through intermingling or, since that is unlikely to happen, through statistics and data visualization. Collect data, then communicate it. In short, Gates advocates creating statistics through measuring development efforts, and changing the world by exposing people to these data.

Stuart Shieber and the Future of Open Access Publishing

Back in February Harvard adopted a mandate requiring its faculty members to make their research papers available within a year of publication. Stuart Shieber is a computer science professor at Harvard and responsible for proposing the policy. He has since been named director of Harvard’s new Office for Scholarly Communication.

On November 12 Shieber gave a talk entitled “The Future of Open Access — and How to Stop It” to give an update on where things stand after the adoption of the open access mandate. Open access isn’t just something that makes sense from an ethical standpoint, as Shieber points out that (for-profit) journal subscription costs have risen out of proportion with inflation costs and out of proportion with the costs of nonprofit journals. He notes that the cost per published page in a commercial journal is six times that of the nonprofits. With the current library budget cuts, open access — meaning both access to articles directly on the web and shifting subscriptions away from for-profit journals — is something that appears financially unavoidable.

Here’s the business model for an Open Access (OA) journal: authors pay a fee upfront in order for their paper to be published. Then the issue of the journal appears on the web (possibly also in print) without an access fee. Conversely, traditional for-profit publishing doesn’t charge the author to publish, but keeps the journal closed and charges subscription fees for access.

Shieber recaps Harvard’s policy:

1. The faculty member grants permission to the University to make the article available through an OA repository.

2. There is a waiver for articles: a faculty member can opt out of the OA mandate at his or her sole discretion. For example, if you have a prior agreement with a publisher you can abide by it.

3. The author deposits the article in the repository themselves.

Shieber notes that the policy is also valuable because it allows Harvard to make a collective statement of principle and to systematically provide metadata about articles; it clarifies the rights accruing to the article; it allows the university to facilitate the article deposit process and to negotiate collectively; and making the mandate opt-out rather than opt-in might increase rights retention at the author level.

So the concern Shieber set up in his talk is whether standards for research quality and peer review will be weakened. Here’s how the dystopian argument runs:

1. all universities enact OA policies
2. all articles become OA
3. libraries cancel subscriptions
4. prices go up on remaining journals
5. these remaining journals can’t recoup their costs
6. publishers can’t adapt their business model
7. so the journals, and the logistics of peer review they provide, disappear

Shieber counters this argument: 1 through 5 are good because journals will start to feel some competitive pressure. What would be bad is if publishers cannot change their way of doing business. Shieber thinks that even if this is so it will have the effect of pushing us towards OA journals, which provide the same services, including peer review, as the traditional commercial journals.

But does the process of getting there cause a race to the bottom? The argument goes like this: since OA journals are paid by the number of articles published, they will just publish everything, thereby destroying standards. Shieber argues this won’t happen because there is price discrimination among journals: authors will pay more to publish in the more prestigious journals. For example, PLOS costs about $3,000 per article, BioMed Central about $1,000, and Scientific Publishers International $96. Shieber also argues that Harvard should have a fund to support faculty who wish to publish in an OA journal and have no other way to pay the fee.

This seems to imply that researchers with sufficient grant funding, or those covered by his proposed Harvard publication fee subsidy, would be immune to the fee pressure and would simply submit to the most prestigious journal and work their way down the chain until their paper is accepted. This also means that editors and reviewers decide what constitutes the best scientific articles by determining acceptance.

But is democratic representation in science a goal of OA? Missing from Shieber’s described market for scientific publications is any kind of feedback from the readers. The content of these journals, and the determination of prestige, is defined solely by the editors and reviewers. Maybe this is a good thing. But maybe there’s an opportunity to open this up by allowing readers a voice in the market. This could be done through ads or a very small fee on articles; both would give OA publishers an incentive to respond to the preferences of the readers. Perhaps OA journals should be commercial in the sense of profit-maximizing: they might have a reason to listen to readers and might be more effective at maximizing their prestige level.

This vision of OA publishing still effectively excludes researchers who are unable to secure grants or are not affiliated with a university that offers a publication subsidy. The dream behind OA publishing is that everyone can read the articles, but to fully engage in the intellectual debate, quality research must still find its way into print, and at the appropriate level of prestige, regardless of the affiliation of the researcher. This is the other side of OA, and it is very important for researchers from the developing world or thinkers whose research is not mainstream (see, for example, Garrett Lisi, a high-impact researcher who is unaffiliated with an institution).

The OA publishing model Shieber describes is a clear step forward from the current model, where journals are only accessible to affiliates of universities that have paid the subscription fees. It might be worth continuing to move toward an OA system where not only can anyone access publications, but any quality research is capable of being published, regardless of the author’s affiliation and wealth. To get around the financial constraints, one approach might be to allow journals to fund themselves through ads, or to provide subsidies to certain researchers. This also opens up the question of who decides what counts as quality research.

A2K3: Connectivity and Democratic Ideals

Also in the final A2K3 panel, The Global Public Sphere: Media and Communication Rights, Natasha Primo, National ICT policy advocacy coordinator for the Association for Progressive Communications, discusses three questions that happen to be related to my current research. 1) Where is the global in the global public sphere? 2) Who is the public in the global public sphere? and 3) How do we get closer to the promise of development and the practice of democratic values and freedom of expression?

She begins with the premise that we are in an increasingly interconnected world, in economic, political, and social spheres, and you will be excluded if you are not connected. She also asserts the premise that connection to the internet can lead to the opening of democratic spaces and – in time – a true global public sphere.

Primo, like Ó Siochrú in my blog post here, doesn’t see any ‘global’ in the global public sphere. She thinks this is just a matter of timing, and not a systematic problem. She notes that the GSM organization predicts 5 billion people on the GSM network by 2015, whereas we now have 1 of 6 billion people connected to the internet. (Note that Primo believes internet access will come through the cell phone for many people who are not connected today.) She refers us to Richard Heeks’ proposal for a Blackberry-for-development. Heeks is professor and chair of the Development Informatics Department at the University of Manchester. But Primo sees cost as the major barrier to connectivity among LDCs and thinks this will abate over time.

With regard to the cost of connectivity, she notes that Africa has a 10% internet subscription rate versus in Asia-Pacific and 72% in Europe. South Africa is planning an affordable broadband campaign: to have some facilities declared ‘essential’ to make them available to the public at cost to the service providers. This comes from the A2K idea of partnership for higher education in Africa: African universities are to have cheaper access. She also sees authoritarian behavior by states as another obstacle to connectivity. She cites research by our very own OpenNet Initiative showing that 24 of 40 countries studied are filtering the internet and using blocking tools to prevent freedom of expression. This is done by blocking blogging sites and YouTube. She is worried about how this behavior by governments affects people’s behavior when they are online. She notes surveys that show two extreme reactions: people either practice substantial self-censorship or put their lives on the line for the right to express an opinion.

Primo notes the cultural obstacles to the global public sphere. She relates a story that some groups are not accustomed to hearing opinions that diverge from their own and will, innocently, flag them as inappropriate content. Dissenting opinions come back online after a short amount of time, but with the delay dialogue can be lost.

A2K3: Communication Rights as a Framework for Global Connectivity

In the last A2K3 panel, entitled The Global Public Sphere: Media and Communication Rights, Seán Ó Siochrú made some striking statements based on his experience building local communication networks in undeveloped areas of LDCs. He states that the global public sphere is currently a myth, and what we have now is elites promoting their self-interest. He criticizes the very notion of the global public sphere: he wants a more dynamic and broader term that reflects the deeper issues involved in bringing about such a global public sphere. He prefers to frame this issue in terms of communication rights. By this he means the right to seek and receive ideas, generate ideas and opinions of one’s own, speak these ideas, have a right to be heard, and a right to have others listen. These last two rights Ó Siochrú dismisses as trivial, but I don’t see that they are. Each creates a demand on others’ time that I don’t see how to effectuate within the framework of respect for individual autonomy Belkin elucidated in his keynote address and discussed in my recent blog post and on the A2K blog.

Ó Siochrú also makes an interesting point that if we are really interested in facilitating communication and connection between and by people who have little connectivity today, we would do best to concentrate on technologies such as the radio, email, mobile phones, the television, or whatever works at the local level. He eschews blogs, and the internet, as the least accessible, least affordable, and the least usable.

A2K3: Opening Scientific Research Requires Societal Change

In the A2K3 panel on Open Access to Science and Research, Eve Gray, from the Centre for Educational Technology, University of Cape Town, sees the Open Access movement as a real societal change. Accordingly she shows us a picture of Nelson Mandela and asks us to think about his release from prison and the amount of change that ushered in. She also asks us to consider whether or not Mandela is an international person or a local person. She sees a parallel between how South African society changed with Mandela and the change people are advocating for in open access to research knowledge. She shows a worldmapper.org map of countries distorted by the amount of (copyrighted) scientific research publications. South Africa looks small. She blames this on South Africa’s willingness to uphold colonial traditions in copyright law and norms in knowledge dissemination. She says this happens almost unquestioningly, and in South Africa, to rise in the research world you are expected to publish in ‘international’ journals; the prestigious journals are not South African, she says. (I am familiar with this attitude from my own experience in Canada. The top American journals and schools were considered the holy grail. When I asked about attending a top American graduate school I was laughed at by a professor and told that maybe it could happen, if perhaps I had an Olympic gold medal.) She states that for real change in this area to come about, people have to recognize that they must mediate a “complex meshing” of policies: at the university level, the various government levels, and the level of norms and the individual scientist… just as Mandela had to mediate a large number of complex policies at a variety of different levels in order to bring about the change he did.