Low-Cost Internet Access Using Mechanical Backhaul

MALE SPEAKER: I’m very happy
to introduce Professor S. Keshav today to talk to us. Keshav has at various times
been an academic, an industrial researcher, and a
Silicon Valley entrepreneur. And now, he’s moved back to
academia at University of Waterloo in Canada, where he’s
working on, among other things, the problem of getting
low-cost, reliable internet access to India, and has
come up with a novel way of doing so. S. KESHAV: Thank you. I should add that this work is
joint work with Aaditeshwar Seth, who’s my PhD student. And so all the good ideas
belong to him. And if something doesn’t look
quite right, that’s my contribution. So that’s the way we are. So let me start off by saying,
what is a kiosk. A kiosk is, I think
of two things, technology and a person. Technology because you want to
do some communication or something special, and a human
operator because the people who are using this aren’t
quite up to using it themselves. Maybe they’re illiterate. OK? So you need a human operator. Today, kiosks provide
a variety of government-to-citizen
services. And I have a list over here
which you can read. We’re working with a partner
called eGovServices, who are based out of south India. And they provide 200 different
services, including actually mutual funds, crop insurance,
railway reservations, and so on and so forth. And I’ll talk a little
bit more about how the technology works. But this is actually
out there. They’re actually
well-established means of delivering technology
to large countries. The public call offices, or
what in India is called STD/ISD offices, there
ae about 150,000 of them right now. And they’re all profitable. And they all actually support
one enterpreneur. So you have 150,000 who are
making a living out of this. And based on the success of
this particular model, the Ministry of Information
Technology in India is proposing to set up 100,000
kiosks over the next two years. And there are several people who
are trying to bid for this opportunity. So this is actually something
which is happening. And the bids are going out. And what we’re trying to do in
this research is to actually make these kiosks useful. So to make a kiosk useful,
it has to be connected. If it’s not connected, it’s not
doing any work for you. And the three options that are
out there for connectivity are dial-up, VSAT, a Very Small
Aperture Terminal, and a long-range wifi. So dialup is your typical
phone modem. You’re going at between 8 and
28 kilobits a second. And the problem with this is
that you’re going over a wire. And in a hot, dusty environment
where people are digging all the time because
they’re building buildings and roads, you have outages. And these outages take fairly
long times to fix, typically three to four days. So if a kiosk is done for three
days, and somebody comes and say, I want my birth
certificate, or do some kind of financial transaction,
you’re going to turn people off. And in fact, project after
project in India and other countries, the kiosks have died
out because people go to the kiosk and shut down because
it doesn’t work. So part of what I’m trying to
do here is to address that problem and say, how can you
make kiosks that are low-cost and reliable. And you can talk about that both
from the operating system perspective and the
communication perspective. The VSAT, or Very
Small Aperture Terminals, are popular. These are pizza-sides dishes
which you can put on the roof. The problem is the upfront
cost is fairly high, plus there’s a monthly rental fee. And the monthly rental
fee’s also high. So if people who do this kind
of thing with VSATs end up making a loss, they’re not able
to sustain themselves on the kind of transaction fees
that they’re extracting, which is roughly between $0.05 and
$0.15 a transaction. So you can’t really do that. It’s not enough money
to pay for the VSAT. And spare parts are
hard to get. And long-range wifi, a couple
of people are trying to do this work out of Berkeley,
[UNINTELLIGIBLE], and so on, to take 802.11b and make it very
long range and make it low-cost. It’s experimental. And the up-front expense is not
in the wifi card, as it turns out, but in the tower
cost. Trees there are roughly 16 meters, in most parts
of India, 50 feet. And you need 18-meter towers. And 18-meter towers are way more
expensive than 15-meter towers, which are way
more expensive than 10-meter towers. So the cost goes up
exponentially. So the big cost is
in the towers. So a tower costs roughly $1,000,
which is essentially unaffordable for the kinds
of deployments we’re thinking about. 100,000 kiosks at $1,000
is $100 million. So what can you do? And this is something called
mechanical backhaul. And the idea is not mine. The idea is due to a project
called DakNet, which is done out of MIT Media Lab, about
three years ago. And as you can see on
the right hand side, they have this bus. And they actually put a wifi
access point on the bus, and a small device which
has storage. I don’t know if I can get
the mouse here, OK. On the left hand side, we have
these kiosks in the villages. And in each kiosk, you
have a wifi box. And what it’s doing is it
gives it to the bus. And the bus actually carries
it to the city, and it goes off into the city. And it goes round and round,
very much like what a postman would do. And the term mechanical
backhaul is due to [? Arno Penzias, ?] who said, oh, this is just doing
backhaul, but it’s doing it mechanically. So I love the term,
and I stole it. So both the idea and the term
are stolen, which isn’t a normal research practice. So this gives you the
intuition of what we’re trying to do. And what I’m going to show you
in the next few slides is that while it sounds simple, here,
you just put a bus an access point, in fact, you have to
reexamine the very foundations of networking, things like
naming, addressing, routing, locationing, all of these things
are completely changed. And so I’ll talk about
an architecture. I’ll tell you about the
open problems and our implementation. And just to give you a sneak
preview, we’re actually going to deploy two of these in
May, in south India. So this is not just talk. We’re actually going
off and doing this. So what are the goals? In academic research, it’s very
normal to come up with random assumptions and goals. But you’re trying to build
something for the real world. You have to make sure the goals
match what’s out there. Otherwise, you’re building
the wrong thing. So these are the goals we have.
The first one is low cost, low cost because you
want to have mass scale. So we want to have less than
$250 per kiosk up front and less than $50 a month
operational costs. And this $250 includes the
PC or the machine that’s providing the services, as
well as all networking. Everything has to
be in the $250. That’s fairly stringent. You want it to be reliable
because, as I said earlier, if it’s unreliable, people
just get turned off. They don’t want to
use it anymore. We want to allow user mobility
because it turns out, as social studies have shown, that
villages actually move in about a 15-kilometer radius. And it’s not as surprising as
you would think, because they’re going to the market
towns to sell their produce. They’re visiting relatives
in other villages. They may be peddlers or
small-time traders who are picking up goods from other
towns and bringing them to villages and so on. Mobility is necessary so that
the villagers, or the people using this, are able to use the
kiosk that is closest to them instead of being tied to
a particular one kiosk. They can go anywhere
and pick up stuff. We want to have data privacy to
allow banking transactions, maybe medical records,
financial transactions of any sort. So we need privacy in the
specific sense of authentication and
message privacy. We want to be able to use
existing internet services because a lot of stuff
out there– people want to use Google,
no matter where they are. But the servers on Google
are not aware of these connections, because
we don’t have a connection all the time. So how do you do that? We want to exploit every
possible connectivity option that we have. Some kiosks
will have dialup. Some will have dialup
and long-range wifi. Some will have VSAT. Whatever is there, we
want to use it. And so we have to have an
architecture that allows you to specify this is an
urgent message. Send it on a cell phone. Or this is a message
that can wait. Send it on your wifi whenever
the bus comes by, and so on and so forth. And finally, we want to support
two classes of users, kiosk users who are going to
a kiosk and using something there, and people who have
their own devices. It could be health care workers
or government workers who have a PDA and so on
and want to use it. So I’m spending a lot of time
on the goals because as you can see, if I don’t have the
right goals, then I’m going to do the wrong thing. So if there are any questions
about this, I’ll answer that now. Does it make sense? OK. So why is it hard? If I have these goals, what
makes it difficult? The first thing is that we don’t
really have a connection end-to-end. If somebody’s on a PC on the
kiosk, and they want to access email servers, say Hotmail, or
they want to do some web query, both ends of the
so-called connection are not simultaneously present. So you can’t really run TCP
end-to-end because guess what? All you have is TCP to the bus,
and then an hour or two later, you have TCP
from the bus to whatever is the backend. But you don’t have the same
thing end-to-end. So you can’t do things like
TCP window flow control end-to-end. You only have hop-by-hop. That fundamentally changes the
character of what you can do. So you can’t use standard
TCP/IP. You can’t use DNS because there
isn’t a DNS server you can talk to. All you can talk to is a bus. You can’t use SSL because
SSL has this three-way handshake, right? It says, here’s my
[UNINTELLIGIBLE]. You can’t use that because
there’s no– it’s all hop-by-hop assumptions. so that’s one change
in mindset. The second thing is that
there’s a bunch of technologies that are
out there which are mobile IP, HIP. There’s an alphabet soup of
things which deal with mobility, and some of them
deal with disconnection. But they typically assume you’re
mostly connected and rarely disconnected. Or you switch. Think of TCP Migrate,
for example. You can be aware of that. You’re saying, well, my IP
address keeps changing, but I’m connected all the time. And you just kind of flip
the IP addresses using a handshake, using your public
key or something like that. But here, we’re in the
opposite regime. You’re mostly disconnected and
rarely get connections. You’re transient, upload as a
bus goes past, and then you’re transient, and another
bus goes past. And finally, we have these
issues of low cost and reliability, which are very
stringent assumptions. And these change the design. And without going into much
detail, we did look at all the existing work, and we
couldn’t use them. So what is it we can use? What are the building blocks,
the LEGO blocks, if you will, that you can put together
to get what we want? And one is we can use storage. Storage is very cheap,
$2.00 a gigabyte. You can get maybe $1.00 a
gigabyte, maybe less if you’re buying it in bulk. Wireless networks everywhere,
and both the cellular kind of wireless networks and wifi,
these cards are cheap and ubiquitous. And then there’s this new
technology which has come out in the last few years called
delay-tolerant networking. It came out of the
Interplanetary Space research Group, out of JPL. But then there’s been a
considerable amount of work being done out of Intel Research
Berkeley by Kevin Fall and others. In a nutshell, what it has is
an overlay network on top of the internet. So what you’re talking about
is a hop-by-hop. And each hop is TCP/IP,
for the most part. And we send messages that look
like bundles, which are called bundles, which go
over potentially disconnected links. Which is exactly like the kiosk
to bus, and the bus to the next hop. And think of it like email. Email doesn’t require
you to be present. Both ends don’t need
to be present. You just dump it in
a mail server. It makes its way somehow to the
destination mail server. And then somebody picks it up. And that’s what DTN does,
except that it can do better routing. And I’ll talk about
routing in a bit. And finally, DTN allows you to
have extensible naming and addressing based on URIs. So everything’s a URI. And we do the name-based
forwarding. And so these characteristics of
DTN are very well matched to the kind of things that we
have. However, DTN does not address things like mobility,
does not address issues like security and routing. So we actually have extended
DTN to take care of those issues. So that was the big picture,
what are the goals and the principles and the techniques. And now, I’m going to dive right
into the architecture. So this is what it looks like. And here I’m going to wave
my hands because I don’t have a pointer. I’m not supposed to
use a pointer. And I don’t have a mouse
either, I guess. All right. So what they have on the left
over here is this bus route, A. And the bus route, A,
can you think of as a regular bus route. But you have a guy in a kiosk
who’s a regular user. And the bus picks up
the data and goes to an internet gateway. So the bus goes round and round,
picks up data from here opportunistically and then drops
it off at the gateway, which is the brick
wall over there. And there it enters
the internet. And the upload is typically
through a DSL at about 100, 200 kilobibts per second. So it’s not super fast,
but it’s OK. This over here represents a very
interesting case where the bus is dropping stuff off
at this intermediate kiosk. And then you have another two
kiosks, which are simply connected through this
intermediate kiosk over here. So the kiosk is actually
acting as a store and forward router. And in the third case over here,
bus route D is actually not talking to a kiosk. It’s talking to somebody
who’s at home. And they’re picking up stuff
along that’s in that person’s house as they go past. So all
of these are typical use cases, and we would like to
support all three of them. We also have users who
are going from one kiosk to another. And that’s the presenter in
the bottom over there. So that’s the architecture. So what does it look
like in a kiosk? And what we have is a design
that we want to make work in very limited circumstances. So what we do is we give the
kiosk controller, the entrepreneurs running the kiosk,
what we call a kiosk controller with this
Soekris box. Those of you who’ve seen it,
it’s about eight inches, about an inch thick. It has no fan. It has a low-power CPU and a
couple of PC card slots. And it’s headless, keyboardless,
no video out. It takes only seven watts. So we can take a 10-watt solar
panel, stick it on the roof, put in a lead-acid cell, a gel
cell, and power that for five, six hours based on that. And what it does is that it has
inside it a RAID 1 disk, a mirrored disk, and provides
NetBoot and a file system and WLAN access. So what the enterpreneur does
is to go buy a junk PC, a secondhand PC, costs about $100,
sticks it into this box and hits reboot, and
it comes up clean. The box is running
Debian Linux. The Soekris, the PC comes
up running Linus using terminal services. It’s actually hacked-up
terminal services. First thing to contrast
this is with One Laptop per Child project. As you know, MIT is trying to
do this thing with this one laptop every child is given. And the difference, contrast
is immediate. We have one recycled PC per
village versus one laptop per child, so the cost now gets
about a factor of 100 away. We’re 100 times cheaper
right off the bat. The second thing is One Laptop
per Child will get dust on it, and it’ll break down. It’s going to happen. Somebody leaves it in the
sun for five minutes, it’s going to fry. Where are you going to
get spare parts from? Right? Whereas here, spare parts
of PCs, they are in fact everywhere, because there are
about 700 million of them, and people just have these things. And getting recycled PCs
is kind of easy. How many of you have an
old PC in your garage? OK. Chances are pretty good that
if they ask for donations, they’ll have 20 in
five minutes. My lab is full of
recycled PCs. I just ask once, and people
are throwing it at me. It’s easy to get these things. So for sure, they will
run pirated Windows. That’s what’s going to
run on these PCs. They will run on it software
which is not quite the right thing to run. And it’s going to have viruses,
adware, spyware, malware, whatever. It’s going to have all sorts of
junk in it, and that’s OK. What you do is you put in the
ethernet cable and press reboot, and it boots
from the net. And it’s not going
to use your disk. It’s just going to
use RAM disk, and it’s going to be clean. So that’s the way
we designed it. And this is all working. So we can take any old PC. We don’t care what’s
on the disk. In fact, you can junk the disk
and just boot off the Soekris, and you’re OK. So we have, as I said, spent
some time thinking about what are the environmental conditions
under which this is going to be used. Inside ferries, or the buses,
we call them ferries, data ferries, and gateways, the same
kind of thing, we have a box with some RAID on it. Here is now, it looks like from
a networking perspective, is a graph. What is shown over here on the
top is this cloud, which is the legacy server,
whatever server. And it goes to a proxy. We have a proxy to hide
these connections from the rest of the world. So from the perspective
legacy server, all you have is a proxy. And I’ll go deeper into
what’s inside the proxy in just a minute. These I1, I2, I3 and so on
are internet gateways. And these dashed lines represent
slow links, slow meaning about 100 kilobits per
second, as opposed to the wifi links, which are 10 to 15
megabits per second, considerably faster. These rings called B1, B2,
B3 are the bus routes. And this actually opens up a
very funny question, which is, how do you draw a network where
the nodes are moving. OK? When you talk about a network
graph, you draw a graph and expect everything to
stay where it is. But here, I have a node which
is going all over the place. And in fact, I don’t think this
is the right way to do it, either, OK? Because– I can talk about that
offline further. But representing statically the
bus routes turns out to be very problematic when you’re
trying to do any kind of link-edge, link weight-based
routing topologies or routing metrics, OSPF or something
like that. Because depending on different
points in time, the cost of the links changes, the
time-varying cost per link. Failure probabilities
also change. So it gets messy. But for the moment, this is a
first-order approximation that seems to work. And as you can see, we have User
2 over here going from one kiosk to the other kiosk. One is on two bus routes,
and so on. And one can imagine constructing
shortest paths on a graph such as this. And in fact, it does do the
right thing, though there are some problems I’ll
talk about later. So this is just to give you an
idea of how the whole thing models as a graph. So this is a protocol
architecture. And bear with me. It’s not as horrible
as it looks. On the left, we have
the kiosk. And I’m showing an
SMTP example. So under this line over here is
your standard SMTP, hello, from, to, data, quit. Right? Standard stuff. What’s happening in the kiosk
controller is running smtpd, and is running SMTP Demon,
essentially saying, I’m the mail server. And it takes fakes Hotmail,
whatever, Gmail. It says, I am who you want me to
be, and gets all the data. And what it does is it puts them
into this protocal called OCMP, which is our own protocol
called Opportunistic Connection Management
Protocol. And OCMP essentially grabs the
data, puts it into persisent storage on a MySQL database. The reason is because
a Soekris bus is going to go down. We have five watts of power,
but it may be a cloudy day. So when it comes back,
it’s just fsck or the equivalent of that. And we know what bundles
are out there. So we’re power fault tolerant. And then we have this thing
called a DTN bundle protocol agent, which essentially grabs
the bundles, sends them to a network of routers using DTN
routing, and off it appears into the proxy. And in the proxy, you have this
applicaiton ID that says this is email application. And they are plugged in on the
other side which gets the data and sends it to the regular
email server using SMTP, which is over here. So essentially, what we’ve done
is taken a normal TCP/IP connection and put it into these
bundles or messages and replaced the whole thing under
the cobwebs, so that neither the client nor the server needw
to know the difference. OK. So any questions about
this so far? OK. So now the talk gets hairier. This is the easy part. So we’re going to talk about
naming and addressing. And let’s see, OK, we have
about half an hour. All right. So we use names for
everything,. Ferries, kiosks, gateways
all have a name. And for us, a name is just a
string, any string you want. It could be your name
and address. It could be your phone number. It could be your
email address. It doesn’t matter. For uniformity, we just hash
it, and we get a 20-byte string, which we use
for forwarding. So it’s a name-based system
where all the forwarding is done on names. Clearly you see right away, you
don’t need DNS, because all you’re going to do is to
simply hash your name, and you get your address. And off you go. We call this a GUID,
Global Unique ID. Now we have this complication
of this connection, right? This connection says you can
send it to somebody, but they’re not there. They’ve turned off their PDA. Or you send it to the kiosk, and
the kiosk is down because there is no power available
right now. So what we do is we have this
notion of a custodian. And a custodian is somebody
who’s usually supposed to be on all the time. It’s exactly like your
[email protected]_server. You’re not there, but your mail
server is there for you. So that’s a custodian. So we have this notion
of a custodian. And I said that everything
has a GUID, so a custodian’s a GUID. And so the full name of somebody
is going to be the custodian GUID, user GUID. It’s just like [email protected]_server. That’s my GUID. There’s a custodian GUID. And the same [INAUDIBLE]. And what the sender’s going
to do is to send it to the custodian’s GUID somehow. And then after that, the
custodian will figure out how to send it to the user. So that’s how you deal
with disconnection. And the custodian also acts as
an anchor point, which is shown in this picture
over here. So for the regular kiosk user,
the custodian is right here, because that’s where you’re
going to pick it up from. But a mobile user will pick a
custodian over here, which is the closest point under
which you’re moving in terms of the sub-tree. So wherever you are, just pick
it up from the custodian. So when you go to this village,
you say, OK, I’m picking up from here. When you’re in that village,
you pick up from there. And everything is cached there
for you or staged there for you to pick up later. And in the case of Bus Route D,
the custodian actually is the bus because the bus is the
point which is going to give you your stuff. So the notion of custody
is fairly nice. It allows us to deal with
disconnection properly. Now I’m not talking about
mobility yet. I’m just talking about
disconnection. However, we have to deal
with mobility. And what mobility means
is that your custodian can change. Right? Because you go to different
place, and your custodian changes. So we need to somehow tell
senders that I’m at a different custodian. And how do you do that? You use an old trick which cell
phones have been using for 20 years, which is just when
you go to home location register, it says, if you want
to reach me, I’m available at so-and-so location. And what we mean by location
here is the custodian’s location, because that’s what’s
going to get to me. So what we do is that we have
this home location register, which is this big look-up
table in the internet. You can use a disputed hash
table if you want. And every user says my GUID is
X, and my custodian is Y. So if anybody wants to reach me,
tell them to send it to Y. And you put that in the Home
Location Register. Now the sender may not know why,
because the sender may be disconnected also. The sender is some
other village. So what we do is that we have
a special custodian name called “unbound,” which says,
I don’t know who the receiver’s custodian is. I’m just going to
put the unbound. And whenever the bundle gets to
somebody on the internet, then that person can bind it at
that point in time to the right custodian. So you’re sending this message
with the GUID of the destination, but with
no custodian name. But then halfway through, it
figures out where it’s really supposed to go. An analogy that might help is to
send IP packets with names in the header. And when it gets too close to
DNS, it resolves into an IP address, sticks the IP address
in, and then you know where to go. We’re doing the same kind of
thing, except we’re doing it name-to-name translation,
unbound to a name translation. So how do you set it up? Well, that’s done through
signaling. It’s a standard control
plane in a cell phone. When you turn on your phone, a
register message is sent to your what’s called a visitor
location register, which says, so-and-so cell phone is in
so-and-so tower area. Same way over here, we send a
register message which goes, quote unquote, “towards”
the internet. And the way you do towards is
by configuring every router with a default route that says,
this is my default route to go to the internet. It’s what we do today. So by just setting one
configuration parameter in every DTN router, you
know exactly how to get to the internet. And the register message flows
up to the internet. And then you update the Home
Location Register, saying this is the custodian. And so that’s the solution
in a nutshell. There are quite a few
race conditions. And there is a paper I can share
with you, which has all the solutions to race
conditions, which are a little bit tricky. OK. So that gives you the flavor of
the naming and addressing. And so to summarize, we have
to have custodians to deal with disconnection. Users have names. Names are any string
that you want. Forwarding is done based
on the names. | do you do forwarding
based on names? That’s routing. OK? And that’s what I’m going
to talk about next. So we’ve just assumed that
magically, the sender has a GUID and magically reaches
its destination. But this is actually
pretty hard to do. And I’m going to tell
you three solutions. And when I show you the third
solution, I’ll tell you why it’s hard. So the first one is just
flooding, which is just dumb. Give everything to everybody. And it actually works,
right, by definition. It also finds a shortest path,
because it finds all paths. But it’s not something you’d
really want to do. But for very small deployments,
just a couple of buses and 5, 10 kiosks,
why not. Just send it to everybody. The second choice is what we
call reverse path forwarding. And it’s easy to understand, but
conceptually, it’s tricky. So let me show you what you’re
doing over here. So imagine the user U5 in the
far right corner who I said is registering with
the HLR, right? That’s what you do when you
change your location or you go to a new place when you come
up for the first time. So U5 is going to send this
reg message along this particular default route all
the way to the internet, saying, hey, register me. Now what we do is very simple. Let’s say you look
at kiosk K3. When K3 gets a message from
Bus 3 saying U5 is registering, it says, oh, U5
is registering through you. Therefore, if I want to
send to U5, I should just give it to you. Just swap the route road back. This is a very old trick that
dates back to the ’60s, I believe, ’67 or something
like that. And it’s called reverse
path forwarding. It says if I’ve got something
on this port, I must be able to reach you on that port. And you just do that
recursively, and you get to the top. And so the registration, which
is a location update, is actually coopted to do routing
table as well. So you’re doing routing and
locationing at the same time. It sounds very clever, but it
has a fatal flaw, which is that it’s a single spanning
tree, which means that it’s susceptible to failure. So in this case, for example,
Bus 2 goes through internet gateway I2 and I3. But it has decided that the
default route is I3. So if I3 goes down, though I2 is
a perfectly reasonable way to go, you can’t use
it, because that’s not the default route. So there are simple hacks to get
around that, but generally speaking, because you’re
constructing a single spanning tree, you get efficiency,
because it’s a single copy, not multiple copy,
but it’s fragile. And so the one hack, again, or
one way to get around that, is to have a soft state. Every half hour, you send
a new register message. So even if you’re down, you’re
down for half an hour. And that’s exactly what
we’re doing on our implementation today. But it’s not very satisfying
because this is not very good. What you really should be doing
is link state routing. This is the holy grail, which is
to represent each link in a link state packet as you’re
doing OSPF, and the flood it. So everybody constructs their
own topology, constructs shortest routes, and so on. There are two problems
over here. One is that I don’t know how to
draw a topology for a bus. The ring I showed you is a first
approximation, but it’s not very good, because imagine
that there’s a bus, and there’s a user who’s just
picking up data from the bus as it goes past anywhere
along its route, anywhere along its route. How do you represent that? What is a link metric that goes
from the bus to the user? It depends whether
the user was. And in fact, the routing policy
you should have depends on where the user was. If you catch a bus just as it’s
setting out from the bus depot, which is the internet
gateway, then the best routing policies is for the bus to go to
another bus which is going back towards the bus depot in
the opposite direction, across the street, so that it goes
back to the depot. However, if you catch the bus
towards the end of its duty cycle, then it should
just keep it. So the bus has to make this
kind of decision. And we would like to make the
decision using standard OSPF and standard link metrics,
but I don’t know how to draw a graph. So this is really kind of
funny, because I’ve been studying networks for nearly 20
years, and I’ve never had a situation where I don’t know
how to draw a graph. Well, anyway, maybe somebody
can help me here. The other thing is determining a
link metric is hard because, as I mentioned to you, the
uplink is a DSL or something like that, that I3
uplink is a DSL. So that’s about a gigabyte a
day, ballpark, gigabyte a day. The buses carry about 40
gigabytes in their disks. So if you dump the whole
contents of a bus, you have 40 days of queuing at one router. It sounds crazy, but
it’s possible. So you want to be very sure of
which particular gateway you’re choosing. However, the link metric, in
this case the queue length, changes before you get there. So the bus may decide that when
it gets here, it says, oh, I3 is heavily loaded. I2 is lightly loaded. So I’m going to send on I2. But by the time it gets there,
somebody may have dumped stuff on to I2. So The metric has changed
on you, because the time scale is hours. Some of the bus, some of the
opportunistic connections dump more stuff in it. So managing these link costs
becomes quite important. And pathological cases are very
easy to construct, when you have ping pong routes
or you have very large delays and so on. The solution that we have in
mind is to use used cell phones and use them as a serial
modem and hook them up to our Soekris boxes and send
at 8 kilobits per second, which is the GPRS rate,
routing updates. So because the real underlying
problem is that the time scale at which your routing things are
changing is the time scale of forwarding. Another way of putting it is
when you do TCP flow control today, you do window
up and window down. You add it to get a decrease,
but the path is fixed, the route is fixed. So I’m changing on you. Those routes are changing on
the order of five minutes. And the TCP’s changing on the
order of 100 milliseconds. Here, the rate at which data
arrives and the rate at which the routing metrics are being
updated are the same time scale, which is why you
have this problem of time scale coupling. We need to decouple it. And the way to decouple it, as I
said, is to use cell phones. And that’s one way
of looking at it. But load-sensitive routing,
which is really what you have to do when you have open-loop
flow control– you can’t do closed-loop flow control because
the round-trip time propagation delays are
on the order of days. So you have to do open-loop. So open-loop flow control with
load-sensitive routing is a known hard problem. And for those of you who
remember the old ARPANET metric, the load-sensitive ping
metric, will recall that it had horrible oscillation
problems. And we want to avoid that. And so we need to
figure it out. We don’t have a solution yet,
but we have some good clues. And this is our current work. OK, so how about security? Now we’re switching
gears here. Why do we need security? Well, as I mentioned earlier,
you want privacy and authentication for banking,
secure email, bill payment, reservations, and so on. Obviously, you want
nonrepeatability and things like that. What we’ve looked that is why
not use PKI, mainly because I don’t know the public
keys of everybody. I have to– if I go ask the PKI,
give me the public key of so-and-so, well, it’s
hard to find. You could use certificates,
but then you have some revocation problems, because
you’re going to tell everybody your working is a problem. And so what we’re looking at is
a different solution which is based on identity-based
cryptography. This is something that
came out of Stanford about five years ago. And the idea is very simple. Your public key is your ID. Remember, your ID is just
a hash of your name. So if your name is [email protected]
or [email protected], then I just hash it. That’s a public key. If I know a public key, then I
must introduce a secret to create the private key,
because if there’s an algorithmic way to go from
public to private, everybody knows your private
key as well. So there has to be additional
secret. And that additional secret is
injected by something called the private key generator. So this is sort of
how it works. You go to a kiosk and say,
I want to sign up. I want to sign up as a user. And then the kiosk guy says
OK, your user name is so-and-so at so-and-so
provider. And by the way, hash it. That’s your public key, and
that’s also your address. All three are the same. And then the private key has
to be given to you by a private key generator. And we have a protocol, which
I’ll probably skip in the interest of time. It allows the private key
generator to give you your private key over a disconnected
network. OK? So the nice thing is that if I
know somebody’s name or their phone number, I can send
them a secure message. I’ll establish a secure channel
with them with the property that the private
key generator can spy on everybody, which is a bug or a
feature depending on which country you live in. In this country,
it’s a feature. In Canada, it’s a bug. So given this thing, you can
actually communicate with anybody you want to, because you
just need to know they’re Bank 1 at Main Street,
wherever. You take that whole
string, hash it. And as long as everybody agrees
that’s the bank’s name, you actually have a secure
channel to them, which is really great. So you’re standing by the
roadside, and you want to send a secure message to anybody. You don’t need DNS. You don’t need PKI. All you need is to know how
to hash, which is easy. There are some schemes we’ve
come up with on how to do give a disconnected user
a private key. And there are some interesting
corner cases to worry about because the kiosk guy is
not to be trusted. He’s somebody– he’s just a random person
who wants to make money. So how do you make sure that guy
isn’t using the people’s bank accounts to make
unauthorized withdrawals? So we have a scheme that uses
essentially shrinkwraps, cellophane wrappers, around
the SIM card like things to do that. We have to worry about
revocation. If you have a smart card or a
RFID which represents your key, and you lose it, you
don’t want people to be able to use it. And so we have some schemes
for revocation. And we also have schemes for
mutual authentication, where a user and a bus, a kiosk
and a bus, can authenticate each other. So you’re standing on the street
and a bus goes past, you can be sure that
it’s a valid bus. And the bus can be sure
it’s a valid user. And they can audit you. And they can also set
up a trail of usage they can bill you. So I’m going to skip those
things and go on to the next part, which is application
support. I had given you a preview of
the application support earlier where I said that
we had the email. You had this server pretending
to be the mail server, and it’s using something called
Opportunistic Connection Management Protocol. So I’m going to talk now for
supporting native applications or supporting applications which
are the server-side, which are pretending to be
native applications. So you want to have a simple
API to use it. You want to have last session
persistence, which means that the client server has some
session going, a file upload session, and the session
persists despite many buses going past. So each bus carries
a part of the session, and that’s OK. It’s all put together
the right way. You want to be able to choose
which network to use. If you say I want to use a cell
phone link, even though it’s expensive, because it’s
high priority, you want to be able to choose that. And finally, we want to be
allowed legacy servers. And the solution is this
thing, OCMP, and it’s a Java-based application. So you can run it on
smartphones, and you can also run it on Kiosk controllers. And you know that actually,
you’re on a bunch of different platforms. The one nice idea is to hide
disconnection under a directory API. And the idea is that you drop
something today in your directory, and it magically
appears in some other directory somewhere else. This actually was modulated by
the Plan 9 file system, which allowed you to use directories
for mapping FTP, so the FTP mapped stuff which– I think, Dave, maybe you
did that way back when at Bell Labs. We said, oh, that
is a cool thing. So we’re doing the same
thing over here. You have each application is
associated with an upload and a download directory and has
a status and a config file. So the config file
says what are my parameters, user name, password. And each application can simply
drop stuff into the upload directory, and it
magically appears in the upload directory in
the corresponding server on the proxy. And then the proxy plugin
on the other side knows what to do. So it makes it very easy
to write apps. And as we’ll show you in a
minute, we’ve done about six apps already. And I can get a student who
has no exposure to this to write an app in about
two hours training. All they need to do is grab a
file, drop it in a directory, and then forget about it. So most of the work actually
is writing GUIs. We use proxies. And we have this ability to
relocate open connections, which is not particularly
relevant to this project. But we’re using it in
a different project. So this is now the
complex slide. And I’m going to spend maybe
five minutes walking through it, because it’s kind of cool. At the top left, I have
the application. And just look at the middle
line which says directory watcher plugin. So what the app is doing is
it’s dropping stuff into a directory and then
forgets about it. The nice thing about this is
that it encourages people to have non-chatty protocols. You don’t want to have
request-response type things going over to directories. Because you drop in a directory,
you don’t know when the response is going to come. So it forces people to
do the right thing. The directory watcher plugin
is part of our OCMP implementation. It simply watches
change times. And when it sees something over
there, then it segments it into bundles and puts it into
persistent storage, into MySQL database or equivalent. You don’t really care what
database you use. The reason is because this
thing can die, right? When you come back up again,
you just recover, as I mentioned earlier. Below it, we have this thing
called a connection pool. And [UNINTELLIGIBLE] connection pool is a
bunch of sockets. Each socket is bound to a
particular NIC, a deep dialup NIC or a GPRS NIC, or in
this case, a bundle protocol agent NIC. So we have a pool
of worker bees. And I can say, here,
send this bundle. And it just goes and
sends the bundle. So all the applications share
a single connection pool. So we only really have one
and two in connection. And as those of you recall
from [? HTTP 1.1 ?] having persistant TCP is a good
idea when you’re talking about dialup, so you don’t
have to go through a slow start each time. So the connection update
is sitting over there. And what they have is the
ability for when you choose to send a bundle, we do an upcall
into the application saying, look, I’m about to
send this bundle. Where do you want me
to send it to? And that allows you
to have very fine-grained selection policies. Now, what the policies
actually are is still current work. But we have the mechanism to
choose exactly which nick you want to send it to. And when it gets into that, it
goes– in our case, the BPA, the bundle protocol agent,
is running on the kiosk controller, also running
on the bus. So it gets the bundle and puts
it into local persistent storage, the disk, drives it
over, and it comes to the internet gate, which is also
running a bundle protocol agent, and often goes over the
internet, over the DSL line, to the proxy. And the proxy gets the bundles,
reassembles them, and basically calls the application
plugin. So what OCMP’s doing is the
ability to provide plugins, to do segmentation reassembly. It does proxy to client
session persistence. And it provides the nick
selection choices. So these are all the things
that OCMP is doing. And you can think of it as
essentially application support for DTN. OK? This is implemented, works. Oh, great, that’s
the next slide. So we’ve implemented all of our
stuff as an extension to the DTNR’s, the DTN Research
Group implementation, which is a C++ implementation. So we’ve added all the
application support to it. We also added the reverse path
forwarding support to it. The application support
is in Java. It runs on PDAs on
[? Dream, ?] on any standard Pocket
PC 2003. And the Home Location Register
that I mentioned to you is actually using OpenDHT from
Intel Research at Berkeley. We’ve made some simplification
with the first implementation. The first is that we’ve
restricted custodians to be internet gateways. So all internet gateways
are also custodians. We’re using reverse path
forwarding routing. We’re not doing load-sensitive
routing. We haven’t implemented the
cell-phone-based control plane, though we’d like to. And we’re not doing any
data replication. We could easily do Tornado
accords or UDP [? instead of ?] TCP. But there are some funny
interactions between data replication and flow control or
open-loop flow control that we haven’t quite figured out. So we aren’t doing
that right now. But we have made these
following changes. We have a separate
control plane. So we have an app called the
control plane app, which makes it very easy to say, this
control plane app will always use a cell phone. So this data plane, control
plane separation is something that I really believe in. And so we’ve done
that over here. We have extended the DTN
namespace, the URI namespace, as I mentioned, to have the
custodian ID and the user ID. One problem is when a user
walks up to a kiosk, they don’t know anything about
20-byte strings and all this other stuff, right? What we in fact give them is
UNIX IDs, just standard UIDs. And you say your UNIX
ID corresponds to this 20-bit string. So the end user is
never actually exposed to these UIDs. They’re just getting
standard logins and passwords on a Linux box. And so that allows us
to keep track of the file system very cleanly. And we have a file called NCID
map that mounts from a UNIX ID to a GUID. And as I mentioned earlier,
all the recycled PCs just mount the root file system
from the Soekris box. And so they inherit the
same UID architecture. So we don’t have to worry
about IP password. We just get one UID, and
everybody’s happy from there. We just use Webmin
to create UIDs. And finally, the Home
Location Register is implemented using OpenDHT. In the original DTN
implementation, there’s no mobility support. So we had to add that in. All of this stuff is actually
done and working. OCMP is implemented in Java. We’re running it on Linux,
Windows XP, WinCE, and a student at AIT Delhi working
with us has ported it to .NET compact, so it should run on
the .NET Windows Mobile platform, whenever we
get our hands on it. The code is open source. It’s being used already by a
student at UC-Davis for an RSS news reader. Sprint is using it for HTTP. And IATW is using it for
XMPP, Jabber or GTalk. These are the applications we
have running right now. We have a mobile blog. And this was the first
thing decided. The whole idea was to take a
photograph or enter some text, put it on your PDA, and then
walk past a Starbucks, right? You can do the same
thing over here. You can walk past a kiosk. So you’re a journalist, or
you’re some random person who wants to upload things. You can just go past a kiosk,
and it’ll actually sync up with the Soekris box, put
it in a user directory. The bus will come and
take it away. I was talking to a student
yesterday at Berkeley about the situation happening in
Rwanda about malaria control. They don’t know who has malaria
in which district. So they have these shipments of
malaria control drugs which come in and just randomly send
it to random districts, because they have no clue
where it’s going. They have no communication
infrastructure in place. So when you say somebody’s
sending these things, one thing we’re trying to do, and I
hope it will work out, is to give these guys in rwanda a
Soekris-based thing in the car they can drive to the
district with. And it will simply run
on their cell phone. And you could actually do a
survey and say, what is the incidence of malaria in this
district, and have it collated and be uploaded and go to the
central government, which actually knows where to
send malaria medicine. Right now, they’re just doing
whatever they feel like doing. It’s pretty horrendous. We’re looking at Jabber, not
because you could do instant messaging with a
one-day delay. Instead of is it instant
messaging, or what is it? But Jabber is a nice app because
it has three modes. You can do file transfer, you
can do email, and you can do instant messaging. So you get all the three
kinds of things. And you can actually do instant messaging on top of dialup. So you want to actually separate
out within the app that these bundles go on dialup,
those bundles go on wifi, those bundles go
on GPRS or whatever. So we can actually think about
doing those things. So Jabber is a nice test case. We have HTTP gate for
obvious reasons. We’re working on email. And I’ll take a minute, because
this is kind of an interesting– we’re using
Thunderbird, and we’re internationalizing. We’re just doing
internationalization extensions. And this gentleman showed up
at my office two weeks ago, saying I can do any language
you want using six keys. And he’s figured out a way to
use six keys to enter 12 different languages. He can do five Indian
languages. He can do Chinese, Korean,
Bengali, Gujarati, Hindi, and Inuktitut, which is
what the Inuits speak in northern Canada. And he does it with six keys. And he just saw my website
and showed up. I said, can you do Telugu,
because a deployment is going to be in Uttar Pradesh? And he said, give me two days. He comes back two days later,
and he has Telugu. He doesn’t read or write the
language, but he’s a speech pathologist, and he knows
how to do these things. And he taught himself
basic programming, Visual Basic or whatever. And he’s writing this stuff. So we’re going to have an
English keyboard with six keys with stickers on them, which
will be used to enter Telugu. And we’ll provide Telugu email
in our deployment. So this is kind of nice. So if you’re interested,
send me an email. I’ll send you– he has a website
with a video of how he does these kinds of things. Flickr upload is underway. Flickr is a great way, and if
you’re looking for photographs of various water collecting
for malaria prevention programs, you can take
photographs and actually upload them using this. So Flickr is a nice
way to do that. So as I mentioned, the DTN2
extensions are done. OCMP’s done. The kiosk controller
is ready to go. And if anybody’s interested,
we’re writing up a tech note on how to make your Soekris box
work this way, so you can just replicate this. Security’s being developed
standalone. We need to integrate that. It’s going to happen in the
next couple of months. And we’ll do a first deployment
in two villages in Uttar Pradesh in May. So we’re going to go
off and do this. Conclusions, OK, so what
were the goals? We wanted low cost. So the
Soekris box costs about $200, give or take, depending
on what you put in it. The recycled PCs were $100. So we’re roughly in the $250
ballpark, which is not bad, because a village of 500 users,
it’s still $0.50 a person, which is OK, because
this includes communication plus computing. Reliability, I don’t think
we’ve got there. Because we need to do data
replication, and we got stuck with routing. So once we get the routing
figured out and the flow control figured out, we
can do reliability. But for now, I’m not
too happy with it. We allow user mobility. We allow data privacy
using the IBC stuff. We are able to use
existing internet services using proxies. We are able to use any
available network. So if somebody says, I have a
kiosk with long-range wifi and dialup, no problem. We’ll support that. And we are able to support
laptop and kiosk users. So we met most of our goals,
and I hope to meet the last goal by this fall. So bottom line, kiosks
are a good idea. Despite all the bad press that
they get in some quarters because of failures, I think
they’re a good idea because they’re very cost effective. The cost per person is in the
$0.50 to $1.00 range. However, the failures are
because of inappropriate technology. People are taking fully featured
Windows XP boxes that take 70 watts of power. They have CD-ROM drives. And they put them in an air-conditioned room with a dialup. OK? That’s your standard, state
of the art today. All of these are bad. You should not using
full-featured Windows XP, because it has so many
viruses, worms, et cetera, et cetera. In Cambodia, people find that
60% of the network bandwidth on the dialup is used
for updating virus definition files. So this is a bad idea. They should not be taking
70 watts of power. They should be taking a factor
or order of magnitude less, like 7 watts. They should not be
using dialup. It is flaky. They should be using the buses,
which are going round and round anyway. You may as well just use them. So by using appropriate
technology, can solve the real problem. And the underlying problems,
when you try to build an appropriate analogy, are not
easy, just as the graph problem I told you about
is not easy. Just solving the time scale
coupling problem is not easy. So there’s fundamental research
you can do which is intellectually challenging but
also is very applicable in this particular situation. So we think ferries
are attractive. We have practical solutions
to some of these things, mostly working. I’ve told you exactly where
we aren’t working. And I’d like to acknowledge that
the DTN code comes from Intel Research at Berkeley. And OpenDHT is also from Intel
Research at Berkeley. And our funding is being funded
by a bunch of people. And you will note
the asterisk. Thank you. [APPLAUSE] S. KESHAV: Yes. AUDIENCE: So [INAUDIBLE] comparison between satellite
or even GPRS [INAUDIBLE]? S. KESHAV: OK, so the question
is, well, how about using satellite or GPRS? Satellite, you have the upfront cost of the VSAT terminal. The VSAT terminal cost, it
depends on where you get it from and so on. But you’re talking roughly
about $500 or so. And then you have a monthly fee
to rent the space in the transponder. And that’s anywhere from
$50 to $200 a month. This is costing you
basically nothing. GPRS, you pay per bit. You pay per bit. In Canada, for GPRS service, on
the lowest rate plan, I pay $23 a megabyte. On the highest rate plan,
I pay $3 a megabyte. In India, it’s cheaper. India, I think it comes to about
$0.30 a megabyte, or something like that. But still, you cannot imagine
transferring 10 pictures over GPRS. You would just go bankrupt. AUDIENCE: [INAUDIBLE] essential services, like life
and death certificates? S. KESHAV: Life and death
certificates, you can wait a day or two, because right now,
they have to go to district headquarters by bus, get the
certificate and come back, so it takes two days anyway. So you may as well
just use this. AUDIENCE: [INAUDIBLE]? S. KESHAV: So the question is,
can be do low-bandwidth transactional services
using GPRS? Answer is yes. This is mostly useful for higher
capacity, value-added services like email
with attachments. But if you do only
transactional, small kind of things, you can just use
a cell phone link. Though, by the way, that’s not
widely available either. The coverage is not as good
as you would like. Yeah, but we’re looking at
essentially beyond they very low-end transactions. Yes? AUDIENCE: You hear a lot about
people like [INAUDIBLE] dig trenches out in the country. So [INAUDIBLE] doing that? S. KESHAV: OK. The quesiton is, if people like
Reliance in India are digging trenches around
the country, how far is it from reality? India is, in fact, one
of the most wired countries in terms of fiber. Recent studies have shown that
every point in the country is within 25 kilometers
of a fiber pop. The fiber pops are not lit. They’re dark fiber. But they’re within
25 kilometers, which is quite amazing. Another statistic which,
actually, I found quite surprising, was 40% of the
world’s fiber is owned by India, including international
fiber. So India is fiber-rich. However, the issue is of can you
get to the villages from this, the backhaul, right? That’s really what we’re trying
to solve over here. We’re not saying we’re
going to have the high capacity uplink. But maybe you go from
DSL to OC3. But the last mile for 100,000
villagers is not easy. And that’s what we’re trying
to solve over here. Yes. AUDIENCE: So how does
this compare to dialup in terms of cost? S. KESHAV: How does it
compare to dialup? Dialup, you pay, what is it, one
rupee a minute for dialup? Here, you pay zero
rupees a minute. It’s free, right? There’s a bus going round
and round anyway. So other than their maintenance
costs, which depends on outages, there’s only
upfront cost. There’s no monthly fees. It’s just for free. AUDIENCE: [INAUDIBLE] number of bytes? S. KESHAV: Number of
bytes per second. OK. So the dialup typically
measures 10 kilobits per second. OK? This one, you can get up to– five minutes of bus
time is equivalent to one day of dialup. OK? That’s the ratio. So if you have two buses going
past, five minutes contact time, you’re twice as fast
as the rate of dialup. Actually, it’s faster,
because buses get there in a few hours. AUDIENCE: Will you
take one more? S. KESHAV: One more question. Yes. AUDIENCE: How many buses are
you talking about total? S. KESHAV: How many buses
are we talking– AUDIENCE: And are they
doing anything besides just this work? Are they carrying
people as well? S. KESHAV: Oh, right. So the question is are these
buses dedicated to this? No, these are just regular
road transfer buses that happen to be going past. What we
do is they have 12 volt DC. And we just take the 12 volt
DC, produce a power conditioner, and stick it into
a Soekris box that’s just sitting there. We need to potentially drill a
hole or go through a window for an external antenna. But that’s it. It’s an $8 antenna. AUDIENCE: How are you
[INAUDIBLE] the maintenance of the kiosk? S. KESHAV: The maintenance? AUDIENCE: The maintenance
of the kiosk. S. KESHAV: Maintenancce
of the kiosk. As I mentioned, we’re working
with this partner called eGovServices. And they have 80 kiosks in
operation today already. The way they work
is like this. They have 300 people in Delhi
who are doing business process outsourcing. And they have a team that’s
split off to provide the infrastructure. OK, these guys have written
an application front end. And they’re on Oracle
10g in the kiosks. And they do all the transactions
and store a file. And then they train the kiosk
operator to upload the file using dialup. That’s the problem right now is
that the upload is not very fast. So they can’t
provide email. They can’t provide
attachments. And they can’t do much more. What we’re doing is working with
these guys who already know how to run a kiosk and
getting them better plumbing. That’s our job. The kiosk training, entrepreneur
training, all that stuff, is being handled
by eGovServices. And they plan to scale to 4,200
kiosks in a year or so. So they are pretty well
capable of doing that. AUDIENCE: Does every kiosk
have its own operator? S. KESHAV: Yes, each kiosk
has its own operator. Yeah. OK. Thank you very much.

2 comments

  1. WOW ! Grehyound bus Token Ring… who would've ever thunk (pun intended) it !

    When you need tech support, will your call get routed to a guy like Roger in Ohio during the middle of the night who will insist his name is really Sandreep ?

Leave a Reply

(*) Required, Your email will not be published