Using emscripten to improve kiwix-js

At the last Kiwix hackathon, we played around a bit with emscripten to improve kiwix-js.

Enabling asm.js optimizations on xzdec

XZ decompression was already handled by code generated from C by emscripten: xzdec.js.

However, this was “almost asm” JavaScript (that string appeared in the JavaScript code), which did not allow browsers to enable their asm.js optimizations.

By recompiling it with different options (see https://github.com/kiwix/kiwix-js/issues/219), we were able to switch to “use asm”. The result: XZ decompression is more than twice as fast.
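
The difference between the two builds is visible in the generated file itself. As a minimal sketch (not part of kiwix-js), the Python snippet below looks for the directive string in a piece of JavaScript source. The function name `asm_directive` and the two sample strings are made up for illustration; they are not taken from xzdec.js.

```python
import re

# Matches a directive string such as "use asm"; or 'almost asm';
DIRECTIVE = re.compile(r"""["'](use asm|almost asm)["']\s*;?""")


def asm_directive(js_source):
    """Return "use asm", "almost asm", or None for the first marker found."""
    match = DIRECTIVE.search(js_source)
    return match.group(1) if match else None


# Hand-written stand-ins for generated output (hypothetical, not real xzdec.js code).
before = 'var asm = (function(global, env, buffer) { "almost asm"; return {}; })();'
after = 'var asm = (function(global, env, buffer) { "use asm"; return {}; })();'

print(asm_directive(before))  # almost asm
print(asm_directive(after))   # use asm
```

A check like this only tells you which marker the file carries; whether a given browser actually applies its asm.js optimizations still has to be verified in that browser.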

We had initially planned to compile to WebAssembly, but that did not bring any significant performance gain in our case. It would also have reduced the set of compatible platforms (yes, kiwix-js supports old HTML/JavaScript engines).

Compiling libzim with emscripten

Dattaz had already managed to compile libzim with emscripten (https://github.com/dattaz/libzim_wasm), with a very promising prototype: https://dattaz.github.io/libzim_wasm/file_api/index.html.

I started again from his code and adapted it to the latest version of libzim (which has changed a lot in the meantime): https://github.com/mossroy/libzim_wasm. I had to apply a few quick-and-dirty patches to the code to make the compilation work and be reproducible.

I then extended the prototype a little to allow C++ functions to be called from JavaScript, on a simple case: https://mossroy.github.io/libzim_wasm/. It works!

I have not had time to go further than this bit of prototype for now, but it opens the door to using libzim in kiwix-js in the future. We can hope for better performance (see the comments on https://github.com/kiwix/kiwix-js/issues/116), as well as access to all the features we had not implemented in JavaScript.

However, there is a bug with files larger than 2 GB (already spotted by dattaz last year): https://github.com/kripken/emscripten/issues/5250. kripken himself looked into it and concluded that it is a limitation of emscripten, but one that does not seem easy to fix (he tagged it “help wanted” on GitHub).

Finally, we may need to compile and use kiwix-lib rather than libzim in order to get somewhat higher-level APIs.

Wikimedia Foundation and Kiwix partner to grow offline access to Wikipedia

Photo by Zack McCune/Wikimedia Foundation, CC BY-SA 4.0.

The Wikimedia Foundation has announced a partnership with Kiwix, the free and open-source software solution that enables offline access to educational content, to expand and improve access to Wikipedia and other Wikimedia projects globally. This partnership will include a $275,000 contribution to Kiwix to further enhance offline access to Wikipedia in parts of the world where consistent, affordable internet connectivity presents a significant barrier to accessing Wikipedia.

“Our hope is that one day everyone will have access to the internet, and eliminate the need for other offline methods of access to information,” said Kiwix CEO Stéphane Coillet-Matillon. “But we know that there are still serious gaps in internet access globally that require solutions today. Kiwix is a tool to start fixing things right now.”

The Wikimedia Foundation and Kiwix have had a long-standing collaborative relationship to expand access to Wikipedia around the world. This includes recent support to Kiwix and WikiProject Medicine to improve the availability of offline Wikipedia medical content, as well as improvements to the Kiwix desktop experience.

Through this partnership, the two organizations will collaborate to create a long-term strategy for third party reuse of Kiwix’s free access platform, fix longstanding code debt, improve Kiwix’s usability across mobile platforms including Android, and integrate Kiwix’s and the Wikimedia Foundation’s technical operations more closely for improved Wikipedia offline experiences.

“As part of the 2030 direction for Wikimedia’s future, we’re thrilled to be partnering with Kiwix to invest in solutions to address one of the critical barriers to participating in Wikipedia globally: reliable internet access,” said Anne Gomez, Senior Program Manager at the Wikimedia Foundation. “We have made a commitment as an organization to actively address the challenges and barriers to reaching our global Wikimedia vision: a world in which everyone can freely share in knowledge. Today marks an important step toward realizing that commitment.”

The Wikimedia vision is global: a world in which everyone can freely share in the sum of all knowledge. While there has been a significant reduction in high mobile data costs and other barriers to participating in Wikipedia, more than half the world’s population is not yet online.

Today, Kiwix sits at the heart of the offline ecosystem with more than 3 million users from more than 200 countries. It can store millions of Wikipedia articles from any of Wikipedia’s nearly 300 languages along with thousands of books and videos on a single flash drive or microSD card for access on smartphones and computers. Kiwix has also worked with nonprofits such as the Orange Foundation, Human Rights Foundation, Internet in a Box, WikiFundi, and Digisoft to scale distribution of offline education materials around the world to students, teachers, and the general public.

More information about the Wikimedia Foundation’s work to expand access and participation to Wikipedia globally, including information about this partnership with Kiwix, can be found in the Wikimedia Foundation’s 2018-2019 annual plan.

About the Wikimedia Foundation

The Wikimedia Foundation is the nonprofit organization that supports and operates Wikipedia and its sister free knowledge projects. Wikipedia is the world’s free knowledge resource, spanning more than 45 million articles across nearly 300 languages. Every month, more than 200,000 people edit Wikipedia and the Wikimedia projects, collectively creating and improving knowledge that is accessed by more than 1 billion unique devices every month. This all makes Wikipedia one of the most popular web properties in the world. Based in San Francisco, California, the Wikimedia Foundation is a 501(c)(3) charity that is funded primarily through donations and grants.

About Kiwix

Kiwix is an open-source software that brings internet content to millions of people without internet access – be it because of cost, poor infrastructures or even censorship. Websites like Wikipedia, TED talks, the Gutenberg library and many more can be stored and browsed as if users were online. Kiwix is available in more than 100 languages, and runs on all major desktop and mobile platforms. Based in Lausanne, Switzerland, Kiwix Association is a registered Swiss Verein that is funded solely through donations and grants. For more information, see http://www.kiwix.org.

Offline-Pedia converts old televisions into Wikipedia readers

 

There are villages in the Ecuadorian Andes that are so small you cannot find them on a map. Cajas Juridica is one such place, located just 13km north of the equator. But two engineering students, Joshua Salazar and Jorge Vega, and the staff of Yachay Tech University have figured out a way to give discarded CRT TV screens a second life, using Kiwix—an offline Wikipedia reader—to bring Wikipedia to these communities. Josh Salazar told us more about their Offline-Pedia project.

Tell me about what started your interest and involvement with offline Wikipedia: are you a Wikipedian?

Since I was a kid I always loved encyclopedias, especially the digital ones because of the multimedia content; I was a really big fan of Encarta. I remember that in the school I attended, there was a computer with Encarta which they used to rent. I really loved to surf through all of its topics, and so did every other kid in that school. My interest really started there, because one of the main problems was that they rented the computers, and also there was no internet connection. A cheap, available-for-anyone computer including the whole of Wikipedia would have been perfect in such places! That’s why I became really motivated to make Wikipedia available for rural communities such as the one where I grew up.

So far I have only edited small sections of a couple of articles, but I’m looking forward to having more time to write once I finish my bachelor’s degree.

What is Offline-Pedia? How does it work?

Offline-Pedia started as a project focused on setting up computers with Wikimedia content, using low cost and recycled materials, such as wood for the case and old CRT TVs as screens, for rural communities.

The use of free hardware and free software (a Raspberry Pi; Raspbian, a Debian-based OS; and Kiwix) makes each device very cheap to build. With an old TV, everything can cost around USD 100 per device. Thus, one of the main goals is to upcycle old CRT TVs and other kinds of compatible screens.

Why CRT TVs? Well, because in Ecuador, at the end of this year (2018), the main TV broadcast signal will switch from analog to digital, and a lot of TVs will then become electronic waste. With the project, we aim to solve two problems: the difficulty of accessing the internet in rural communities, and handling the future electronic waste by reusing the obsolete TVs.

How did you find out about Kiwix?

I have known of its existence since my last year of high school, but I don’t actually remember exactly where I read about it. I’ve always been subscribed to Free Software and Linux newsletters and pages, so maybe in one of those.

What’s the hardest part of the technology to deal with (for you on the one side, and for the end users on the other)?

Compiling the program source code from scratch on the Raspberry Pi. You know, there are always more dependencies that rely on other dependencies to work, or you forget the folder paths of the previously installed packages, and so on… but it was fun. After making it work for the first time, just recently actually, we found out that precompiled versions of Kiwix (but just the server) existed. I became a little upset because I had even missed some classes to compile it in time to fulfill the promise we made to the community. Anyway, later the Kiwix guys also shared with me a beta automatic installer and downloader for preparing microSD disk images for the Raspberry Pi. That will save us a lot of time for the next Offline-Pedias we will be preparing!

The technical difficulties with the users were more related to the ‘techno phobia’ of the elderly people, but the youth and children really loved it!

What happened when people started using your box? What lessons can you share with other people who might want to do similar projects?

They started to look for literature! ‘Romeo and Juliet’ in the Gutenberg Project Library, and the biography of William Shakespeare in Wikipedia. That was something we would never have imagined!

I encourage people to replicate the project wherever they are, because there are always curious people, and helping them access the biggest reference platform in the world makes you feel very inspired, accomplished and happy.

What’s been the biggest surprise? The biggest challenge?

The young people looking for English literature was very unexpected. That demonstrated to us that we don’t really know what the people of the community want from us. So, a big challenge will be to properly learn from the people. But luckily, Sergio Minniti, a professor at our university, Yachay Tech, who develops ‘critical technical practice’ prototypes with students, decided to help us. He is a sociologist and will help us start a second phase of the project: ethnographic studies to better understand the actual needs of the people involved with the project.

Photo via Joshua Salazar, CC BY-SA 4.0.

Offline-Pedia also includes other contents: what are people most interested in, and what would you like to see made available?

Offline-Pedia contains the default tools provided by Raspbian (Scratch, Wolfram Mathematica, and so on) and as much as we could download from the available Kiwix ZIM packages (Vikidia, PhET chemistry and physics simulations, the Gutenberg library; everything except the TED talks, which were very large).

People really loved to have offline Wikipedia access, because it has basically articles about whatever they want to know about.

I would like to implement offline wiki software to let people have a real wiki experience: to create, edit, and discuss the content. Then, when we return periodically to check that everything is running smoothly, we can take the newly created content and upload it to Wikipedia. That would extend Wikipedia’s reach to areas where the internet isn’t yet available.

How do you see these devices impacting the lives of people they are shared with?

I hope people will become more motivated to read and look for answers whenever they have doubts. Since I started to give speeches with Jorge Vega, another member of the Offline-Pedia project, I have wanted people to understand the importance of learning about science and technology, and how those aspects are vital for the economic, academic, and human development of our communities and country. But most importantly, I want to make them aware of the enormous body of knowledge freely available on Wikipedia to every single human on Earth.

What’s next for your project? If you dreamed big, where should it go?

We want to work together with the Ministry of Education of Ecuador to replace the expensive budgets of the projects they have, and also to develop some kind of software that could allow people to learn the minimum topics required for completing the Basic Education Program of Ecuador’s Education Curriculum.

But my biggest dream is to facilitate the access to the Wikimedia content to every single place on Earth, where there is no internet access.

How else can Kiwix or the Wikimedia movement help you?

Spreading the word! We will be preparing a video blog and tutorials so that any tech enthusiast can download the required packages, download the blueprints for cutting the wood for the box on a CNC machine, understand the installation process, and even share their experiences wherever they have installed an Offline-Pedia device running Kiwix, along with any further content that we may develop in the meantime.

What resources exist for people who want to know more?

Right now, we just have the video that the cool guys of the Communication Office of our University helped us record while installing the device in the first rural community. Soon we will be finishing a blog with everything related to the project.

Interview by Stéphane Coillet-Matillon, Kiwix

Video at top by University Yachay Tech, CC BY-SA 4.0.

Don’t panic! Build your own Hitchhiker’s Guide with Wikipedia

Photo by David Strine, CC0.

You may have missed Towel Day in late May, but you know that sooner or later someone is going to build an interstellar highway right where you live.

This, or you really liked The Hitchhiker’s Guide to the Galaxy and think it is pretty much a description of Wikipedia that was well ahead of its time. H2G2, as it is often shorthanded, was originally broadcast on the BBC in 1978: in case you missed it, the series (and subsequent books) tell the (mis)adventures of Arthur Dent, last human survivor after our planet was destroyed to make way for a hyperspace bypass, and Ford Prefect, a human-like alien writer for The Hitchhiker’s Guide to the Galaxy. And they start their journey, in fact, by hitching a ride onto a passing Vogon spacecraft.

Believe it or not, it does all make sense.

Forty years later, there are now almost as many versions of the H2G2 article as there were language editions of the book (about 30). Add a few dozen articles related to the author, the 1984 video game, the radio and movie’s theme tune, and of course all major characters, and you will know more than you ever cared to a few minutes ago. And because we like to do things thoroughly, there are also tributes in Wikidata and even its recent, very beta, lexicographic extension.

The good news (well, at least for the Wiki part) is that there is also a way to make your dream of owning your very own Hitchhiker’s Guide come true: the sum of all knowledge with you, everywhere, without the need for an internet connection (towel not included)!

Enter Kiwix, the offline Wikipedia reader supported by the Wikimedia Foundation and Wikimedia Switzerland, and David Strine, an enthusiastic H2G2 fan, a product manager at the Foundation and enterprising Kiwix user in his spare time.

Kiwix works like a browser, allowing more than 3 million people around the world to store and access on their devices an entire copy of Wikipedia, as well as free-licensed works from the Gutenberg library, TED videos, and much more. Once the content of your choice has been downloaded (or copied from a friend’s), no internet connection is required and you are free to go to the most remote parts of the Galaxy if you so wish (or closer, but just as poorly connected: we know of a couple of Kiwix users in Antarctica, for instance).

Kiwix runs on both Android and iOS and it is, of course, free, as in free speech (i.e. open-source) and free drinks.

For his own purpose, David used his Nook tablet, but any would presumably do. The whole how-to can be found on instructables and is fairly straightforward: buy a tablet, fit it in a cover, install Kiwix and contents, and you will be set.

A few things to keep in mind:

  • The English Wikipedia is big, as in five million articles and 80 gigabytes big, and that’s only if you’re using Kiwix’s compressed .zim file format. That said, microSD cards able to hold 80 GB are usually built to hold 128 GB, so you might as well include Wikivoyage (an actual travel guide) and Wikisource (public domain books), and still have some room to spare. This is a good thing, since there are nearly 800 more files in 100 languages to choose from.
  • The larger the storage size, the higher the cost. You can actually save some space (and therefore money) by using the “nopic” version of each zim file. With no illustrations, files are usually three to four times lighter.
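
The arithmetic in the list above can be sketched in a few lines of Python. The 80 GB and 128 GB figures come from the text; the helper names are hypothetical, and the sketch uses nominal sizes, ignoring file-system overhead and the exact size of each download.

```python
from typing import Tuple

# Figures quoted in the list above; everything else is illustrative.
ENGLISH_WIKIPEDIA_GB = 80
CARD_CAPACITY_GB = 128


def free_space_gb(card_gb: float, *file_sizes_gb: float) -> float:
    """Nominal space left on the card once the given files are copied."""
    return card_gb - sum(file_sizes_gb)


def nopic_size_range_gb(full_gb: float) -> Tuple[float, float]:
    """Size range of a "nopic" file, reading "three to four times lighter" literally."""
    return (full_gb / 4, full_gb / 3)


print(free_space_gb(CARD_CAPACITY_GB, ENGLISH_WIKIPEDIA_GB))  # 48

low, high = nopic_size_range_gb(ENGLISH_WIKIPEDIA_GB)
print(f"{low:.1f} to {high:.1f} GB")  # 20.0 to 26.7 GB
```

On these assumptions, a "nopic" English Wikipedia would fit on a much smaller (and cheaper) card than the full version.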

The closest the Kiwix team ever had to a real-life galactical hitchhiker was a German seaman who would come and download a “fresh” version of Wikipedia every two years or so (and send us a thank you note for the updates). We also have the odd round-the-world traveler who sends a little hello.

But even if you use Kiwix from home or at school, we love stories: we forward them to our volunteer coders so they know their work matters. So if you are a Kiwix user and would like to share some love, do send an email to stories@kiwix.org. We’re looking forward to reading them!

Stéphane Coillet-Matillon, Kiwix

Gabriel Thullen on bringing offline Wikipedia to West African schools

Photo by Gabriel Thullen, CC BY-SA 4.0.

Senior Program Manager Anne Gomez leads the New Readers initiative, where she works on ways to better understand barriers that prevent people around the world from accessing information online. One of her areas of interest is offline access, as she works with the New Readers team to improve the way people who have limited or infrequent access to the Internet can access free and open knowledge.

Over the coming months, Anne will be interviewing people who work to remove access barriers for people across the world. In her first conversation for the Wikimedia Blog, Anne chatted with Emmanuel Engelhart (aka “Kelson”), a developer who works on Kiwix, an open source software which allows users to download web content for offline reading. In this installment, she interviews Gabriel Thullen, a Geneva (Switzerland) Wikimedian, previous Wikimedia CH board member, and school teacher who has worked with schools across West Africa to test the Kiwix offline Wikipedia browser during the 2016–2017 school year. As Gabriel writes in the Wikimedia Education newsletter, “These schools are in cities with limited access to the Internet and in small towns with little or no electricity, no cell phone coverage, and no Internet.”

Both Anne and Gabriel recently participated in the OFF.NETWORK Content Hackathon to advance Kiwix and its distribution of offline Wikipedia. Their conversation is below. You can also read the other parts of this ongoing series.

Anne Gomez: Tell me about what started your interest and involvement with offline Wikipedia? When was it?

Thullen: I consider that there were two phases in my involvement with offline Wikipedia. I was fascinated by the presentations at Wikimania 2012, about Kiwix and Wikipedia Zero. I really got into using Kiwix while helping distribute it at the WMCH booth during the 2013 Wikimania in Hong Kong, and was able to successfully secure a mandate from my employers for the 2014-2015 school year to investigate the pedagogical opportunities opened up by the use of offline Wikipedia and Kiwix. Unfortunately, we waited over a year for the Kiwix plug computers to be produced so that by the time the hardware was delivered to Geneva, the funds had dried up and the school year was over.

The second phase of my involvement started with a trip to Senegal in Summer 2014. As coach of a team of Senegalese wheelchair basketball players playing in different European championships, I was involved in organizing a tournament in Dakar. At the same time, I was checking up on the distribution of a few hundred math books I had shipped to Senegal, and I met with a few teachers and government officials. Having surmised the usefulness of offline Wikipedia for schools with little or no Internet access, I had brought a few dozen USB keys pre-loaded with Kiwix and the French Wikipedia. In all, a few weeks were spent meeting with schools and other facilities like the US Peace Corps training base in Thiès and the Naval Academy in Dakar. They were all quite enthusiastic and receptive when I shared Kiwix with them.

Gomez: What’s been the biggest surprise for you over the years?

Thullen: That must be the detrimental effect of smartphones on what I consider to be basic computer literacy such as knowing how to use a keyboard and mouse, knowing the difference between input and output, between local and remote storage. I have been teaching computer science since the mid-eighties, before the web was developed at CERN here in Geneva. Over the years, the students knew more and more about computers before coming to school, but I have noticed a sharp reversal in this trend over the past two years as an increasing number of students no longer have a computer at home; everything is done with mobile devices and a smart TV.

Gomez: Smartphones have transformed the way people can access the internet. How has this changed the landscape and the way you view offline access?

Thullen: My experience with offline access is limited to Africa. I have a research project which is currently stuck in Limbo due to the rules and regulations of my employer which absolutely forbid the use of smartphones in class by middle school students. I now feel that we need a two-pronged approach to providing offline access for smartphones. Users should be able to install resources directly on their device, or else they should be able to easily connect to a mobile WiFi hotspot which provides these resources.

Gomez: How do you see these devices impacting the future of educational resources?

Thullen: Making educational resources available to smartphones and similar devices only makes sense if the educators are ready and willing to make use of them, and that means that those who develop them need to work closely with the educational community. I see a huge potential for offline resources where I work, in Swiss schools. Switzerland is one of the most connected countries in the World, but in a learning environment you sometimes need access to certain resources but not to the Internet or to any other form of electronic communication – think of exams, for example. I am really excited to be part of what could turn out to be a paradigm shift in the way of teaching and evaluating students.

Gomez: You’ve brought Kiwix on a wifi hub (Raspberry Pi) and thumb drives to pass out to teachers in Senegal. Help me understand the context you’re working in… what makes one better than the other?

Thullen: The Raspberry Pi Kiwix hotspot I built turned out to have a few limitations: it needs a very regular power supply, which means that it is not available 24 hours a day, every day. On the other hand, a USB key is always available and needs no power source. One other factor that needs to be considered is that when users connect to the WiFi, they lose any Internet connection they may have had. As many people in West Africa communicate by Facebook Messenger, this effectively isolates them from their friends and families. They might be willing to do this for a few minutes, but it might not be a long term solution.

The Raspberry Pi WiFi hotspot and the USB flash drives do not address the same needs. A WiFi hotspot can be used by a group in the same physical location whereas the USB sticks can be used individually and at any time.

Gomez: You and I have talked about the challenges of getting teachers to use and share Wikipedia offline, especially with technical challenges. Thumb drives can easily be wiped and repurposed for storing and sharing other types of content. How are you working with teachers to demonstrate the value of Wikipedia and Kiwix?

Thullen: Music takes up a lot of space. Videos take up even more space. Text takes up very little space. My colleagues have to understand just how much information can be stored on a USB thumb drive if you leave out the videos or the HD photos. It is not obvious to them, and it takes time and practice using the Kiwix program before users are convinced that the whole Wikipedia encyclopedia is on that little drive, that “it is really all there”, to quote a colleague. What I found out during my trips to Senegal is that a lot of teachers there do not know about Wikipedia, so the first step is to ensure that they are familiar with the encyclopedia, and that no matter how much they know about a subject, they can always find more information. That demonstrates the inherent value of the offline encyclopedia so that they are less likely to delete it and reuse the key for other purposes.

Gomez: What’s the hardest part of the technology to deal with?

Thullen: I have now hit a technological wall in West Africa. Please bear with me as I explain, as this probably applies to most other countries to one degree or another.

When I last went to Senegal, over 60% of the computers I saw were running Windows XP, [Editor’s note: Microsoft has released five versions of their OS since XP was released in 2001.] mostly in areas with little or no Internet access. There are two limitations of Windows XP which complicate the distribution of the Kiwix offline Wikipedia reader. The maximum USB size recognized under XP is 32 GB, and the latest version of the French Kiwix & Zim files is just a little bit more than what can be loaded onto a 32 GB flash drive. Kiwix also uses a single index file which is approaching the 4 GB maximum file size for FAT 32 systems, and Windows XP uses FAT 32 flash drives. You need to know that the index file for the English language Wikipedia far exceeds this limit. To make a long and complicated story short, the latest version of the French Kiwix Wikipedia download I can share with my African colleagues is the one from June 2016…

I keep on getting a lot of well-intentioned suggestions: upgrade to Windows 7! – use a Mac! – use Linux! – don’t use FAT 32! – etc. That is fine for us computer geeks, but it definitely is not an option in a lot of parts of this World. As the old saying goes: “if it ain’t broken don’t fix it” – and that applies to computers as well.
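
To make the two constraints described in this answer concrete, here is a rough Python sketch that checks a set of files against them. The per-file limit of 2^32 − 1 bytes is the standard FAT32 maximum; the 32 GB drive limit is simply the figure given in the interview; the function name, file names and sizes are invented for illustration and are not real Kiwix download sizes.

```python
FAT32_MAX_FILE_BYTES = 2**32 - 1  # largest single file FAT32 can store (just under 4 GiB)
XP_MAX_DRIVE_BYTES = 32 * 10**9   # the 32 GB drive limit quoted in the interview


def problems(files, drive_bytes=XP_MAX_DRIVE_BYTES):
    """List the reasons why `files` (name -> size in bytes) cannot be copied as-is."""
    issues = []
    for name, size in files.items():
        if size > FAT32_MAX_FILE_BYTES:
            issues.append(f"{name}: exceeds the FAT32 per-file limit")
    if sum(files.values()) > drive_bytes:
        issues.append("total size exceeds the drive capacity")
    return issues


# Hypothetical package: one content file and one oversized index file.
package = {"content.bin": 3_900_000_000, "index.bin": 4_500_000_000}
print(problems(package))  # ['index.bin: exceeds the FAT32 per-file limit']
```

Here the package as a whole would fit on the drive, but the single index file alone is enough to block the copy, which is the situation described for the index.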

Gomez: What’s the hardest challenge with content?

Thullen: When I present Wikipedia to my African colleagues, online or offline, I can’t help noticing the dearth of information about Africa. There is more information about the village of Genthod in Switzerland (2’700 inhabitants) than about the city of Thiès in Senegal (260’000 inhabitants). Our notability criteria are heavily biased towards what used to be called the “Global North”, and this is flagrant when examining the offline content that I am actively promoting in West African schools. The second-hand printed textbooks that are often used are centered on the French, Belgian or Swiss curriculum. My dream is that our online encyclopedia will soon be more inclusive and better reflect the diversity of our world.

Gomez: What do you see as the future of your on-the-ground work?

Thullen: The Wikipedia encyclopedia is built up by the base, that is by the end-users. The distribution of offline resources should follow the same pattern that made Wikipedia so successful. The distribution should be a large scale community effort, not something managed by a small committee. As for the future of my on-the-ground work, I would like to be able to spend more time and energy helping local communities get organized so that they can distribute the offline resources themselves. Each local community will then be responsible for centralizing the downloads, copying and/or distributing the Kiwix files, training end users, and financing the purchase of the USB flash drives.

Gomez: What’s one thing the Wikimedia Foundation could do to help?

Thullen: There are certain situations where the Foundation could be instrumental in getting Kiwix to those who would benefit – I was thinking about what is happening right now in Puerto Rico. I imagine that the large scale destruction they experienced touched the schools and libraries as well. As a teacher, I am quite concerned about the future of those thousands of students in the affected areas, and I am quite sure that not all will be able to go to the continental US. The Foundation could come forward with a solution which could be implemented immediately and at low cost, thus providing the students with the resource material they need to be able to continue successfully with their studies. This would raise public awareness and support for the Kiwix program.

Gomez: Where do you learn more and share information about offline access?

Thullen: Kiwix.org, obviously. There are relatively few projects centered on developing offline access. Like most of the free or open source software projects, the main problem most people have, ourselves included, is knowing that they exist. You could start by searching for projects like “Pirate Box” or “KA Lite” or “Project Gutenberg”.

Gomez: What resources exist for people who want to know more?

Thullen: After trying what was mentioned above, those looking for more resources can google “OER” (Open Educational Resources). A more original approach is to look up digital educational resources from the 1990’s since most of them were offline – try old computer magazines or old education related magazines. The WayBackMachine could probably help you find what you are looking for.

Anne Gomez, Senior Program Manager, Program Management
Wikimedia Foundation

The future of offline access to Wikipedia: The Kiwix example

Photo by Dietmar Rabich, CC BY-SA 4.0.

Senior Program Manager Anne Gomez leads the New Readers initiative, where she works on ways to better understand barriers that prevent people around the world from accessing information online. One of her areas of interest is offline access, as she works with the New Readers team to improve the way people who have limited or infrequent access to the Internet can access free and open knowledge.

Over the coming months, Anne will be interviewing people who work to remove access barriers for people across the world. In her first conversation for the Wikimedia Blog, Anne chats with Emmanuel Engelhart (aka “Kelson”), a developer who works on Kiwix, open source software that allows users to download web content for offline reading. In the eleven years since it was invented, a number of organizations have used it, including World Possible and Internet in a Box. Still, it’s perhaps best known for its distribution of entire copies of Wikipedia in areas of low bandwidth, like Cuba.

As we noted in a 2014 profile of Kiwix, the software “uses all of Wikipedia’s content through the Parsoid wiki parser to package articles into an open source .zim file that can be read by the special Kiwix browser. Since Kiwix was released in 2007, dozens of languages of Wikipedia have been made available as .zim files, as has other free content, such as Wikisource, Wiktionary and Wikivoyage.”

In addition to Wikimedia content, Kiwix now contains TED talks, the Stack Exchange websites, all of Project Gutenberg, and many YouTube educational channels. Anne and Emmanuel chatted about how video and smart phones are changing the offline landscape—and where Kiwix plans to go from here.

You can also read the other interviews in this series, including a chat with Jeremy Schwartz of World Possible.

———

Anne Gomez: A lot has changed in a decade. What can you do now that wasn’t possible when you started Kiwix?

Engelhart: A lot has changed, indeed. Around us, a lot more people now have broadband access, but 4 billion remain unconnected. At the same time, Internet censorship has increased. That’s not something we’d expected, and it forces us to constantly rethink offline access. On the Kiwix side, the technology has changed a lot and the project has become a lot stronger.

We now have a small and very motivated team of volunteers with a huge array of skills. Our budget, while still ridiculously low, has also increased and allows us to pay for services that are sorely needed to grow in scale.

Ten years ago, the dream was to create a technology to bring Wikipedia to people without Internet access. And we succeeded. But there still are too many folks out there who don’t know about the technology or can’t access it. Our next Big Dream, therefore, is to consolidate our solutions and be more efficient in bringing them to people who really need it.

———

What’s been the biggest surprise for you over the years?

I don’t know if I have a “best surprise ever” to tell… but I’m often impressed by the ingenuity and the resilience of our users. I think in particular about these people who travel, often in really precarious conditions, from school to school to install Wikipedia offline.

Another really dominant feeling I have is my gratefulness to the volunteers who make the project so lively. For the past 10 years, and now more than ever, they have joined and done what needed to be done so that free knowledge is available to all.

———

Smartphones have transformed the way people can access the internet, and recently you started building packaged apps for Wikimed. How has this changed the landscape and the way you view offline access? How do you see these devices impacting the future of educational resources?

In general, I have mixed feelings about the smartphone/tablet ecosystem: On the one hand, it has done a lot to make computers and internet access more affordable to people. It has also allowed for new kinds of software and features. And that’s good.

On the other hand, most ecosystems are closed or proprietary, making software development pretty expensive. They also tend to treat users as consumers and encourage that mindset. I see it as a real issue, in particular for collaborative and participatory movements like Wikimedia.

Most of our audience at Kiwix does not own a computer at all, and probably never will; our priority therefore is to have a great mobile-friendly portfolio. That’s why we spent the last two years developing dedicated apps for Android. These package Kiwix with topic-specific content (e.g. Medicine or Travel, but soon also History, Geography, or Movies). Wikimed has been a huge success, and showed us the way forward. The big lesson for us has been that users search for easily actionable content rather than super powerful technologies. When it comes to offline, size of content does matter as you don’t want to download something you don’t need.

By bringing learning resources and tools at any time and to (almost) every corner, mobile devices have definitely helped people win a bit of freedom. That said, the software engineering challenges are still pretty big and a lot of resources are still needed to make sure this paradigm shift will benefit everyone.

———

What hasn’t changed?

To be honest, we would really love for a technology like Kiwix to become obsolete someday. But unfortunately, this is not going to happen anytime soon. In some cases, it might even become worse. We are concerned that censorship will soon become the #1 problem for those who want to access free knowledge.

———

Video makes up over half of global bandwidth, and from the New Readers research we know that lots of people prefer to learn by video, but it’s expensive to store for offline. How are you thinking about video and other media?

I tend to think that the pedagogical value of videos is overrated. Being lazy myself, I might also prefer watching a video to using other means for learning. That does not mean this is the most appropriate way.

That said, there are lots of legitimate uses for video, and in general we try to stay away from editorial discussions: we only want to focus on building the best technology. And the ZIM format that Kiwix relies on is anyway content-agnostic: this means that you can use it to store whatever content you like. We are actually already distributing dozens of offline files with videos embedded in them.

But of course the reader needs to be able to display these videos efficiently. So far, Kiwix does it but it could be better… This is something we have been working on and will keep on working on in the near future. Hopefully our effort on this will be over next year when we release a new version of Kiwix for Windows/Linux.

———

Kiwix supports more than just Wikipedia—how do you think about what content packs to include?

We are always looking for content that is free (as in free speech). Most of the time, ideas come as feature requests from our users/partners.

———

Kiwix is foundational to a number of other offline educational projects (IiaB, RACHEL, etc.). How do you balance supporting end users and reusers?

We try to support both as much as we can, but we consider integration projects like the ones you mention, as well as those of other deployment partners, to be the key to accessing a broader audience. They are therefore privileged because of the scaling effect they give us in terms of distribution.

———

What resources exist for people who want to know more?

We are a software project, so most of the activity is visible on our code forge (Github) at:

We also have chat channels on Freenode IRC #Kiwix (http://chat.kiwix.org) and on Slack #kiwixoffline (http://kiwixoffline.slack.com).

People can also always send us an email, if only to say hello, at contact[at]kiwix[dot]org

Anne Gomez, Senior Program Manager, Program Management
Wikimedia Foundation

Thanks to Melody Kramer for writing the introduction to this piece.

Kiwix-JS 2.1, ports to Windows Mobile and Ubuntu Touch, and outlook

Activity on Kiwix has been very intense lately.

We released version 2.1 with a few improvements, and above all there are new contributors who are very active. It is a pleasure to see, and it should lead to performance improvements and to ports to Windows 10 Mobile and Ubuntu Touch.

Version 2.1

It was released in early June. In addition to a few minor fixes and improvements, it includes a major refactoring of the code to make it easier for new contributors to understand.

Among other things, compatibility with the Evopedia archive format was dropped, along with unused code. See https://github.com/kiwix/kiwix-js/releases/tag/2.1.0.

Renaming

We renamed the project to better reflect reality: “Kiwix JS” rather than “Kiwix HTML5”.

New contributors

Several people have taken an interest in the project lately: sharun-s, then Jaifroid. They got to grips with the subject quickly and have contributed quality code. dattaz has also worked on using WebAssembly, which looks like a very promising avenue.

I used to feel alone on this project, and now I find myself in high demand, with questions, issues, pull requests and so on. As a result, my development work has been overtaken by coordination and organization work (“Project Leader”, to be a little pretentious…).

It makes a change, and above all it is very motivating to see that the project attracts interest! Many thanks to these contributors for helping to keep it alive.

Version for Windows 10 Mobile

Jaifroid already has a version that works fairly well. He plans to publish the application on the Microsoft Store very soon.

This application (a UWP app) will also be able to run on Windows desktop. But on desktop, despite the optimizations in progress, it will certainly be slower and have fewer features than the classic version of Kiwix (developed in C). To avoid confusion between the two apps, it will probably be named “Light” or something similar. On the other hand, it should have the advantage of being able to run on Windows 10 S (which only allows applications to be installed from the Microsoft Store).

Jaifroid has submitted several patches that have been merged into kiwix-js. But some Windows-specific things remain (encoding files as UTF-8 with BOM, for example), and these live in a separate repository: https://github.com/kiwix/kiwix-js-windows/.

Version for Ubuntu Touch

ernesst built an application for Ubuntu Phone from the kiwix-js code (without modifying it). It is already available at https://open.uappexplorer.com/app/kiwix, even though it is not yet based on a released version. We managed to add generation of the package (.click) for Ubuntu Touch to the project’s continuous integration (on Travis).

You will tell me that Ubuntu Phone has been abandoned by Canonical: that is true. This port is probably not going to bring in many users (even though https://ubports.com/ makes it possible to keep using it), but never mind.

Improvements in progress

In addition to various technical and usability improvements, Jaifroid and sharun-s are focusing mainly on performance. That is just as well, since there is plenty of room for improvement there.

Their experiments are already giving excellent results. The difficulty will be choosing the most appropriate optimizations and integrating them cleanly into the code.

We can hope that the upcoming version 2.2 will bring significant gains.

Outlook

Part of the current performance work focuses on jQuery mode (which is still the default mode). This mode has the big advantage of working everywhere. ServiceWorkers are not yet supported by Edge (the engine behind the Windows port) or by Firefox extensions, and support remains to be verified on Ubuntu Touch. So we don’t have much choice for now.

But in the medium to long term, I think the future still lies in:

  • on the backend side: compiling libzim to WebAssembly. If we manage it, we would benefit from all of libzim’s functional richness, its optimizations and so on. But WebAssembly is not yet supported by all browsers/engines
  • on the UI side: moving to technical solutions better suited than jQuery for injecting resources: ServiceWorker and/or WebRequest (the technology Mozilla offers as a replacement for ServiceWorkers, which it does not support in extensions)
  • on the mobile device side: using Apache Cordova to get generic APIs for reading ZIM files without user intervention, and to avoid platform-specific implementations
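As a rough illustration of why jQuery mode remains the default, the choice between the two injection modes can be thought of as a feature check. The TypeScript below is only a sketch under assumptions: the names `detectCapabilities` and `chooseInjectionMode` are hypothetical and are not taken from the kiwix-js code, and a real extension would also have to check whether the API is usable in its own context.

```typescript
// Hypothetical names, for illustration only (not the kiwix-js API).
type InjectionMode = "jquery" | "serviceworker";

interface Capabilities {
  hasServiceWorker: boolean;
  hasWebAssembly: boolean;
}

// Inspect an environment object (in a browser, this would be `window`).
function detectCapabilities(env: any): Capabilities {
  return {
    // The ServiceWorker API is exposed as navigator.serviceWorker.
    hasServiceWorker: !!env.navigator && "serviceWorker" in env.navigator,
    // WebAssembly is exposed as a global object when the engine supports it.
    hasWebAssembly: typeof env.WebAssembly === "object",
  };
}

// Fall back to jQuery mode, which works everywhere, when ServiceWorkers are missing.
function chooseInjectionMode(caps: Capabilities): InjectionMode {
  return caps.hasServiceWorker ? "serviceworker" : "jquery";
}
```

In a browser one would pass `window` to `detectCapabilities`; passing a plain object instead makes the logic easy to exercise outside a browser.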

It is also conceivable to create “custom apps” as is done on Android, that is, a Kiwix app that bundles the ZIM content with it. This has been successful on Android (e.g. Wikimed with medical data: https://play.google.com/store/apps/details?id=org.kiwix.kiwixcustomwikimed). One advantage is that the user would no longer have to download and then select a ZIM file (we should be able to read it directly). But it could only work with small ZIM files.

Kiwix hackathon, April 2017

In early April, I took part in a week-long hackathon for the Kiwix project, to release browser extensions for reading ZIM files (Wikipedia or other content) offline.

Context

I had already done this in Berlin last year: https://blog.mossroy.fr/2016/01/23/hackathon-kiwix-a-berlin-debut-2016/

This year it took place in Lyon, and there were about a dozen of us. We had great weather, and it was really pleasant.

The atmosphere was great, and I met some really interesting people. It has to be said that it takes a certain mindset to take a week of leave to volunteer for this kind of thing…

The cross code reviews were really enriching, as were the few meetings we held on user experience and automated testing (thanks in particular to the experience shared by Julian Harty). It also helped me improve my spoken English a little, since several nationalities were represented.

Goals and results

My goal was to reuse the code of the kiwix-html5 application (initially built for Firefox OS) and turn it into an extension for the Firefox and Chrome browsers. That was not very complicated in itself, so I also worked on continuous integration with Travis and on a few improvements (details: https://github.com/kiwix/kiwix-html5/milestone/14?closed=1)

It is available for Chrome: https://chrome.google.com/webstore/detail/kiwix/donaljnlmapmngakoipdmehbfcioahhk

The extension has also been submitted for Firefox, but their validation process seems to take much longer. When it is ready, the extension will be available here: https://addons.mozilla.org/firefox/addon/kiwix-offline/. In the meantime, those in a hurry can install a “nightly” version: http://download.kiwix.org/nightly/2017-04-09/kiwix-firefox-signed-extension-2.0commit-bbe7610.xpi (but it will not be updated automatically: install it only for testing).

Status and next steps

This is a first version. It is not very fast, does not cover all the features of the other Kiwix clients, and is not compatible with all ZIM files (I mainly tested with Wikipedia and other Wikimedia content).

But it has the merit of existing, and it doesn’t work too badly. I hope it will be useful to people and will bring this code back to life through user feedback and (you never know) through contributions, because I feel a bit lonely on this code… ;-)

There are plenty of things to improve or fix; I have put a few ideas for the next version here: https://github.com/kiwix/kiwix-html5/milestone/6

One point that unfortunately cannot be improved in the short or medium term is that the user has to manually select the ZIM file(s) every time the extension is opened. For security reasons, browsers do not allow a web page to open a file from the filesystem by itself (one whose location it had stored in a cookie, for example). In an extension, I think it would make sense to allow it (with the user’s permission, of course), but that is not planned yet: https://bugzilla.mozilla.org/show_bug.cgi?id=1246236.

No internet? No problem! Kiwix celebrates ten years of offline Wikipedia reading

Photo by Zack McCune, CC BY-SA 4.0.

When your goal is to make the sum of all human knowledge available to everyone, how do you ensure that people can actually access it? For many Wikipedia readers around the world, the problem may be that internet access is either slow, censored, or even non-existent—and those with a limited phone plan know that using too much data can really hurt their monthly bill.

These are exactly the kind of situations that Kiwix was meant to address: an open-source software that allows people to access a full copy of Wikipedia for offline reading. Wherever you are, wherever you go, you can have Wikipedia with you.

Kiwix was created exactly 10 years ago in Switzerland, with the support of Wikimedia CH (Switzerland). It was initially intended to burn the information onto DVDs. At the time, the alternative for offline knowledge was generally limited to Microsoft Encarta. Times and technologies have changed (Encarta, for instance, was discontinued in 2009), but Kiwix has endured and prospered. Every year, more than a million people worldwide download and use it to access Wikipedia (in more than 100 languages), other Wikimedia projects like Wiktionary or Wikivoyage, educational videos like TED talks, or play with educational science simulations like PhET.

Connectivity in many places around the world is not exactly simple, something demonstrated recently when Google released a new Lite mode for some of its Android products to lighten the amount of data transferred, arguing that in countries like India 2G networks still are the norm. With Kiwix, the only limitation is the initial download, which is usually done on a USB flash drive or microSD card, then copied and circulated offline. After that, people are free to go and carry a piece of internet with them.

We have also developed a Raspberry plug that creates its own local network for up to 25 to 30 users at the same time: nothing to transfer, just bring your wifi-enabled computer or smartphone and access free knowledge like you were sitting in Zurich or San Francisco. These are already very much in demand in West- and Southern-African schools, and we’re looking forward to rolling them out in refugee camps across the Middle East.

Photo by Rama, CC BY-SA 3.0 FR.

Last but not least, we’ve also started to adapt to the growth of mobile by releasing Kiwix for iOS and Android. For the latter, we went one step further and started making smaller dedicated apps: Wikivoyage has become a fully portable travel book, and we released an app that contains every medical article on Wikipedia—in English as well as half a dozen other languages—with the volunteers at Wikiproject Medicine. We’re told that physicians on the Indian subcontinent love it.

After fifteen years of existence, approximately 500 million unique visitors visit Wikipedia every month to learn about pretty much anything, thanks to the work of thousands of volunteer editors. But four billion people out there still do not have reliable access to the internet and cannot benefit from this accumulated wealth. Kiwix turns ten today, and it has already gone a long way to bridging that gap. We’re looking forward to doing better over the next ten years.

Stéphane Coillet-Matillon
Kiwix and Wikimedia CH (Switzerland)

Beta version of kiwix-html5

Following the Kiwix hackathon in January, the HTML5 implementation is becoming usable. The initial target was Firefox OS… which no longer has a future. But the code could eventually be used on other platforms.

As a reminder, Kiwix is a project that provides offline access to all of Wikipedia’s content (and other content).

After fixing a few blocking bugs, we were able to release a beta version of its HTML5/JavaScript port. Beta means imperfect: it is slow, does not yet support all content types (notably videos or anything that relies on JavaScript), and is still buggy.

But it doesn’t work too badly:

[Screenshots: three views of an example article on gravitation]

Compared with Evopedia, the first difference visible to the user is that images are stored in the archive. And that is really cool. On the other hand, it inevitably makes the archive heavier: expect about 20 GB for the whole French version of Wikipedia (there is also an archive without images, which is only 4 GB).

Dependency management

Support for “dependencies” (JavaScript, CSS, images, etc.) was not easy to set up and is not yet perfect.

In Evopedia it was simpler because there were none: a single hard-coded stylesheet was applied to all articles. The OpenZIM format, however, is much more generic: that is a good thing, but it complicates development because this content has to be “injected” into each article.

Two methods were considered for this:

jQuery method

This consists of parsing the HTML page to try to recognize the dependencies in it, and replacing them via jQuery with their content (after reading it from the archive).

Advantage: works on any device
Drawbacks: recognition is inevitably imperfect, and the DOM gets badly “shaken” by these injections
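The idea of recognizing dependencies and replacing them with archive content can be sketched as follows. This is a simplified TypeScript illustration under assumptions, not the kiwix-html5 implementation: it uses a regular expression instead of jQuery and the DOM so that it runs anywhere, it handles only `<img>` tags with double-quoted `src` attributes, and the `ArchiveReader` type and `inlineImages` function are hypothetical names. Archive reads are shown as synchronous here, which a real archive reader would most likely not be.

```typescript
// Hypothetical stand-in for the archive: returns a data URI for a path,
// or undefined when the archive has no such entry.
type ArchiveReader = (path: string) => string | undefined;

// Replace the src of each recognized <img> tag with content read from the archive.
function inlineImages(html: string, readFromArchive: ArchiveReader): string {
  // Captures: everything up to src=", the src value, and the closing quote.
  const imgSrc = /(<img\b[^>]*?\bsrc=")([^"]+)(")/g;
  return html.replace(
    imgSrc,
    (match: string, before: string, src: string, after: string) => {
      const dataUri = readFromArchive(src);
      // Leave the tag untouched when the archive has nothing for this path.
      return dataUri === undefined ? match : before + dataUri + after;
    }
  );
}
```

Even this small sketch shows the drawback mentioned above: an image referenced with single quotes, or from a stylesheet, would not be recognized.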

ServiceWorker method

The ServiceWorker API is relatively recent. Simplifying a little, it is available in Chrome 40+ and Firefox 44+ (but not in version 45 ESR), with debugging tools since Firefox 47.

The idea is to have a kind of “proxy” between the article and the web, which makes it possible to replace outgoing requests with reads from the archive.

On paper, this method is far better than the previous one and matches the need perfectly.

But there are several problems:

  • A ServiceWorker is completely isolated from the DOM and from the rest of the application. It therefore cannot read from the archive directly, which makes it necessary to set up a communication protocol between the ServiceWorker and the backend
  • The ServiceWorker API is not enabled by default on Firefox OS and probably never will be (except perhaps in the future community-run B2G OS)

Advantage: a clean and exhaustive way of handling dependencies. In use, it appears to be faster than the jQuery method
Drawbacks: unavailable on Firefox OS (unless enabled manually via adb), and the implementation is still unstable (I mean mine in kiwix-html5, not the browsers’ API)
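To make the “proxy plus communication protocol” idea more concrete, here is a minimal TypeScript sketch of the two halves. Everything in it is an assumption made for illustration: the message shapes, the `toArchivePath` and `handleContentRequest` names, and the in-memory object standing in for the archive are hypothetical and are not the protocol actually used in kiwix-html5. In a browser, the first function would be called from the ServiceWorker’s `fetch` event handler, and the messages would travel over `postMessage`.

```typescript
// Hypothetical message shapes exchanged between the ServiceWorker and the backend.
interface ContentRequest {
  action: "askForContent";
  path: string;
}

interface ContentResponse {
  action: "giveContent";
  path: string;
  found: boolean;
  content?: string;
}

// ServiceWorker side: map an intercepted URL to a path inside the archive,
// or return null when the request is outside the application's scope.
function toArchivePath(url: string, scope: string): string | null {
  return url.indexOf(scope) === 0
    ? decodeURIComponent(url.slice(scope.length))
    : null;
}

// Backend side: the only part that can read the archive
// (a plain object stands in for the archive here).
function handleContentRequest(
  req: ContentRequest,
  archive: Record<string, string>
): ContentResponse {
  const found = Object.prototype.hasOwnProperty.call(archive, req.path);
  return found
    ? { action: "giveContent", path: req.path, found: true, content: archive[req.path] }
    : { action: "giveContent", path: req.path, found: false };
}
```

The split mirrors the isolation described above: the ServiceWorker only decides which requests concern the archive, and the backend answers them.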

In this version of kiwix-html5, both are available, but we had to favour the jQuery method in order to make progress. Still, the future most likely lies with the ServiceWorker.

Impact of Mozilla abandoning Firefox OS

What is the point of making an application for Firefox OS when the platform has been abandoned by Mozilla (on smartphones, at least)?

Yes, of course I asked myself that question. Clearly, it greatly limits the reach of the Firefox OS application.

But I carried on for several reasons:

  • First, I don’t like leaving unfinished what I have started, when we were so close to a beta version
  • Second, I know people who have a use for it, own a Firefox OS device and intend to keep it as long as possible (I am one of them…)
  • Finally, it is HTML5/JavaScript and therefore in principle portable to other platforms. The only problem is access to ZIM files, which browsers heavily restrict for security reasons

Status of the Firefox OS Marketplace

I submitted this beta version to the Firefox Marketplace at the end of June 2016. The estimated validation time was about twenty days.

But by the end of July, still nothing. When I asked around on IRC, Mozilla employees told me that nobody was working on validating Marketplace applications any more. Gulp: I could have waited a long time…

I was quite disheartened, and left without a way to deploy my app to the general public.

It is really not cool of Mozilla not to display the Marketplace’s status clearly. Why keep accepting submissions if there is nobody to validate them? There is a serious lack of official communication on this. Mozilla seems to forget too quickly that they did sell phones, and that they did rally developers. And all these people now find themselves somewhat abandoned.

Anyway, if someone at Mozilla deigns to take a look (you never know), the app will be available here: https://marketplace.firefox.com/app/kiwix/ , but for now I have no solution to offer other than installing it manually via WebIDE (connecting over USB via ADB), from the sources: https://github.com/kiwix/kiwix-html5.

What next?

There would be plenty to do: the roadmap is on GitHub: https://github.com/kiwix/kiwix-html5/milestones

But there is no longer much point in putting energy specifically into the Firefox OS application. In the short to medium term, this platform will disappear.

So what should be done with this code? In principle, it can serve as the basis for a port to other platforms (mobile or not). Only the access to ZIM files needs to be “ported”: the rest is shared and cross-platform. Any improvements we make would benefit all platforms.

After discussing it with Kelson, two avenues seem worth considering:

  • Implementation as a Firefox/Chrome/Edge extension (these three browsers can now use almost identical APIs)
  • Implementation as a desktop application, using a framework such as Electron for file access