Linux Advanced Routing & Traffic
Control HOWTO
Bert Hubert
Netherlabs BV
Thomas Graf (Section Author)
tgraf%suug.ch
Gregory Maxwell (Section Author)
Remco van Mook (Section Author)
Martijn van Oosterhout (Section Author)
Paul B Schroeder (Section Author)
Jasper Spaans (Section Author)
Pedro Larroy (Section Author)
piotr%member.fsf.org

A very hands-on approach to iproute2, traffic shaping and a bit of netfilter.

Revision History

Revision $Revision$ $Date$

DocBook Edition

Chapter 1. Dedication

This document is dedicated to lots of people, and is my attempt to give something back. To list but a few:

  • Rusty Russell
  • Alexey N. Kuznetsov
  • The good folks from Google
  • The staff of Casema Internet

Chapter 2. Introduction

Welcome, gentle reader.

This document hopes to enlighten you on how to do more with Linux 2.2/2.4 routing. Unbeknownst to most users, you already run tools which allow you to do spectacular things. Commands like route and ifconfig are actually very thin wrappers for the very powerful iproute2 infrastructure.

I hope that this HOWTO will become as readable as the ones by Rusty Russell of (amongst other things) netfilter fame.

You can always reach us by posting to the mailing list (see the relevant section) if you have comments or questions about, or somewhat related to, this HOWTO. We are not a free helpdesk, but we will often answer questions asked on the list.

Before losing your way in this HOWTO: if all you want to do is simple traffic shaping, skip everything and head to the Other possibilities chapter, and read about CBQ.init.

2.1. Disclaimer & License

This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

In short, if your STM-64 backbone breaks down and distributes pornography to your most esteemed customers - it’s never our fault. Sorry.

Copyright (c) 2002 by bert hubert, Gregory Maxwell, Martijn van Oosterhout, Remco van Mook, Paul B. Schroeder and others. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).

Please freely copy and distribute (sell or give away) this document in any format. It’s requested that corrections and/or comments be forwarded to the document maintainer.

It is also requested that if you publish this HOWTO in hardcopy that you send the authors some samples for “review purposes” :-)

2.4. Housekeeping notes

There are several things which should be noted about this document. While I wrote most of it, I really don’t want it to stay that way. I am a strong believer in Open Source, so I encourage you to send feedback, updates, patches etcetera. Do not hesitate to inform me of typos or plain old errors. If my English sounds somewhat wooden, please realize that I’m not a native speaker. Feel free to send suggestions.

If you feel you are better qualified to maintain a section, or think that you can author and maintain new sections, you are welcome to do so. The SGML of this HOWTO is available via GIT; I very much envision more people working on it.

In aid of this, you will find lots of FIXME notices. Patches are always welcome! Wherever you find a FIXME, you should know that you are treading in unknown territory. This is not to say that there are no errors elsewhere, but be extra careful. If you have validated something, please let us know so we can remove the FIXME notice.

About this HOWTO: I will take some liberties along the road. For example, I postulate a 10Mbit Internet connection, while I know full well that those are not very common.

2.5. Access, GIT & submitting updates

The canonical location for the HOWTO is here (http://lartc.org/).

We now have anonymous GIT access available to the world at large. This is good in a number of ways. You can easily upgrade to newer versions of this HOWTO and submitting patches is no work at all. Furthermore, it allows the authors to work on the source independently, which is good too.

$ git clone git://repo.or.cz/lartc.git

or (if you’re behind a firewall which only allows HTTP)

$ git clone http://repo.or.cz/r/lartc.git

Enter the checked out directory (git drops the .git suffix when it names the directory):

$ cd lartc

If you want to update your local copy, run

$ git pull

If you made changes and want to contribute them, run git diff and mail the output to the LARTC mailing list; we can then integrate it easily. Thanks! Please make sure that you edit the .db file, by the way; the other files are generated from that one.

A Makefile is supplied which should help you create postscript, dvi, pdf, html and plain text. You may need to install docbook, docbook-utils, ghostscript and tetex to get all formats.

Be careful not to edit 2.4routing.sgml! It contains an older version of the HOWTO. The right file is lartc.db.

2.6. Mailing list

The authors receive an increasing amount of mail about this HOWTO. Because of the clear interest of the community, it has been decided to start a mailing list where people can talk to each other about Advanced Routing and Traffic Control. You can subscribe to the list here (http://mailman.ds9a.nl/mailman/listinfo/lartc).

It should be pointed out that the authors are very hesitant to answer questions not asked on the list. We would like the archive of the list to become some kind of knowledge base. If you have a question, please search the archive first, and then post to the mailing list.

2.7. Layout of this document

We will be doing interesting stuff almost immediately, which also means that there will initially be parts that are explained incompletely or are not perfect. Please gloss over these parts and assume that all will become clear.

Routing and filtering are two distinct things. Filtering is documented very well by Rusty’s HOWTOs, available here:

  • Rusty’s Remarkably Unreliable Guides (http://netfilter.samba.org/unreliable-guides/)

We will be focusing mostly on what is possible by combining netfilter and iproute2.

Chapter 3. Introduction to iproute

Some parts of iproute require you to have certain kernel options enabled. It should also be noted that all releases of RedHat up to and including 6.2 come without most of the traffic control features in the default kernel.

RedHat 7.2 has everything in by default.

Also make sure that you have netlink support, should you choose to roll your own kernel. Iproute2 needs it.

3.4. Exploring your current configuration

This may come as a surprise, but iproute2 is already configured! The current commands ifconfig and route are already using the advanced syscalls, but mostly with very default (i.e. boring) settings.

The ip tool is central, and we’ll ask it to display our interfaces for us.

3.4.1. ip shows us our links

[ahu@home ahu]$ ip link list
1: lo: mtu 3924 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy: mtu 1500 qdisc noop
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
3: eth0: mtu 1400 qdisc pfifo_fast qlen 100
    link/ether 48:54:e8:2a:47:16 brd ff:ff:ff:ff:ff:ff
4: eth1: mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:4c:39:24:78 brd ff:ff:ff:ff:ff:ff
3764: ppp0: mtu 1492 qdisc pfifo_fast qlen 10
    link/ppp

Your mileage may vary, but this is what it shows on my NAT router at home. I’ll only explain part of the output as not everything is directly relevant.

We first see the loopback interface. While your computer may function somewhat without one, I’d advise against it. The MTU (Maximum Transmission Unit) is 3924 octets, and the interface is not supposed to queue, which makes sense because the loopback interface is a figment of your kernel’s imagination.

I’ll skip the dummy interface for now, and it may not be present on your computer. Then there are my two physical network interfaces: one at the side of my cable modem, the other serving my home ethernet segment. Furthermore, we see a ppp0 interface.

Note the absence of IP addresses. iproute disconnects the concept of ’links’ and ’IP addresses’. With IP aliasing, the concept of ’the’ IP address had become quite irrelevant anyhow.

It does show us the MAC addresses though, the hardware identifier of our ethernet interfaces.
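To pull just the interesting columns out of such a listing, something like the following shell sketch may help. It reads ip link list output on standard input and prints the name, MTU and qdisc of each interface. The function name summarize_links is invented for this example (it is not part of iproute2), and the parsing assumes the line layout shown above.

```shell
#!/bin/sh
# summarize_links (hypothetical helper): read `ip link list` output on
# stdin and print "name mtu qdisc" for every interface header line.
summarize_links() {
    awk '
        # Header lines start with the interface index, e.g. "3: eth0: ..."
        /^[0-9]+: / {
            name = $2; sub(/:$/, "", name)
            mtu = "?"; qdisc = "?"
            # Scan for the keywords; the value is the field that follows.
            for (i = 3; i < NF; i++) {
                if ($i == "mtu")   mtu = $(i + 1)
                if ($i == "qdisc") qdisc = $(i + 1)
            }
            print name, mtu, qdisc
        }
    '
}
```

On a live system this would be used as: ip link list | summarize_links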

3.4.2. ip shows us our IP addresses

[ahu@home ahu]$ ip address show
1: lo: mtu 3924 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: dummy: mtu 1500 qdisc noop
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
3: eth0: mtu 1400 qdisc pfifo_fast qlen 100
    link/ether 48:54:e8:2a:47:16 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/8 brd 10.255.255.255 scope global eth0
4: eth1: mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:4c:39:24:78 brd ff:ff:ff:ff:ff:ff
3764: ppp0: mtu 1492 qdisc pfifo_fast qlen 10
    link/ppp
    inet 212.64.94.251 peer 212.64.94.1/32 scope global ppp0

This contains more information. It shows all our addresses, and to which cards they belong. ’inet’ stands for Internet (IPv4). There are lots of other address families, but these don’t concern us right now.

Let’s examine eth0 somewhat closer. It says that it is related to the inet address ’10.0.0.1/8’. What does this mean? The /8 stands for the number of bits that are in the Network Address. An IPv4 address has 32 bits, so 24 bits are left to number the hosts within our network. The first 8 bits of 10.0.0.1 correspond to 10.0.0.0, our Network Address, and our netmask is 255.0.0.0.

The other bits are connected to this interface, so 10.250.3.13 is directly available on eth0, as is 10.0.0. for example.
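The prefix arithmetic can be checked with a few lines of shell. This is only a sketch: the function names prefix_to_mask and host_bits are made up for this example and do not exist in iproute2.

```shell
#!/bin/sh
# prefix_to_mask (hypothetical helper): convert a prefix length (0..32)
# into a dotted-quad netmask, one octet at a time.
prefix_to_mask() {
    bits=$1
    mask=""
    for i in 1 2 3 4; do
        if [ "$bits" -ge 8 ]; then
            octet=255                      # all eight bits are network bits
            bits=$((bits - 8))
        else
            # only the top $bits bits of this octet are network bits
            octet=$((256 - (1 << (8 - bits))))
            bits=0
        fi
        mask="${mask}${mask:+.}${octet}"
    done
    echo "$mask"
}

# host_bits (hypothetical helper): bits left over for addresses inside
# the network.
host_bits() {
    echo $((32 - $1))
}
```

For the /8 on eth0, prefix_to_mask 8 prints 255.0.0.0 and host_bits 8 prints 24, matching the numbers in the text.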

With ppp0, the same concept goes, though the numbers are different. Its address is 212.64.94.251, without a subnet mask. This means that we have a point-to-point connection and that every address, with the exception of 212.64.94.251, is remote. There is more information, however. It tells us that on the other side of the link there is, yet again, only one address, 212.64.94.1. The /32 tells us that all 32 bits are fixed, so no bits are left over for other hosts.

It is absolutely vital that you grasp these concepts. Refer to the documentation mentioned at the beginning of this HOWTO if you have trouble.

You may also note ’qdisc’, which stands for Queueing Discipline. This will become vital later on.

3.4.3. ip shows us our routes

Well, we now know how to find 10.x.y.z addresses, and we are able to reach 212.64.94.1. This is not

enough however, so we need instructions on how to reach the world. The Internet is available via our ppp

[root@espa041 /home/src/iputils]# ip neigh show
9.3.76.42 dev eth0 lladdr 00:60:08:3f:e9:f9 nud reachable
9.3.76.1 dev eth0 lladdr 00:06:29:21:73:c8 nud reachable

As you can see, my machine espa041 (9.3.76.41) knows where to find espa042 (9.3.76.42) and espagate (9.3.76.1). Now let’s add another machine to the ARP cache.

[root@espa041 /home/paulsch/.gnome-desktop]# ping -c 1 espa043
PING espa043.austin.ibm.com (9.3.76.43) from 9.3.76.41 : 56(84) bytes of data.
64 bytes from 9.3.76.43: icmp_seq=0 ttl=255 time=0.9 ms

--- espa043.austin.ibm.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.9/0.9/0.9 ms

[root@espa041 /home/src/iputils]# ip neigh show
9.3.76.43 dev eth0 lladdr 00:06:29:21:80:20 nud reachable
9.3.76.42 dev eth0 lladdr 00:60:08:3f:e9:f9 nud reachable
9.3.76.1 dev eth0 lladdr 00:06:29:21:73:c8 nud reachable

As a result of espa041 trying to contact espa043, espa043’s hardware address/location has now been added to the ARP/neighbor cache. So until the entry for espa043 times out (as a result of no communication between the two), espa041 knows where to find espa043 and has no need to send an ARP request.

Now let’s delete espa043 from our ARP cache:

[root@espa041 /home/src/iputils]# ip neigh delete 9.3.76.43 dev eth0
[root@espa041 /home/src/iputils]# ip neigh show
9.3.76.43 dev eth0 nud failed
9.3.76.42 dev eth0 lladdr 00:60:08:3f:e9:f9 nud reachable
9.3.76.1 dev eth0 lladdr 00:06:29:21:73:c8 nud stale

Now espa041 has again forgotten where to find espa043 and will need to send another ARP request the next time it needs to communicate with espa043. You can also see from the above output that espagate (9.3.76.1) has been changed to the "stale" state. This means that the location shown is still valid, but it will have to be confirmed at the first transaction to that machine.
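If you want to pick out the entries in one particular state, for instance everything that has gone stale, a small filter along these lines could do it. It is a sketch only: neigh_by_state is a name invented here, and it assumes that the state is the last word on each line, as in the listings above.

```shell
#!/bin/sh
# neigh_by_state STATE (hypothetical helper): read `ip neigh show` output
# on stdin and print the IP address (first field) of every entry whose
# state (last field) equals STATE, compared case-insensitively.
neigh_by_state() {
    awk -v want="$1" 'NF > 0 && tolower($NF) == tolower(want) { print $1 }'
}
```

On a live system: ip neigh show | neigh_by_state stale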

Chapter 4. Rules - routing policy database

If you have a large router, you may well cater for the needs of different people, who should be served differently. The routing policy database allows you to do this by having multiple sets of routing tables.

If you want to use this feature, make sure that your kernel is compiled with the "IP: advanced router" and "IP: policy routing" features.

When the kernel needs to make a routing decision, it finds out which table needs to be consulted. By default, there are three tables. The old ’route’ tool modifies the main and local tables, as does the ip tool (by default).
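Coming back to the two kernel features named above: in kernel configuration files they appear as CONFIG_IP_ADVANCED_ROUTER and CONFIG_IP_MULTIPLE_TABLES. The following sketch checks a config file for both. The helper name check_policy_routing is invented for this example, and where your distribution keeps the kernel config (often /boot/config-$(uname -r)) is an assumption you will have to verify yourself.

```shell
#!/bin/sh
# check_policy_routing CONFIG_FILE (hypothetical helper): report whether
# the options behind "IP: advanced router" and "IP: policy routing" are
# built in (=y). Returns non-zero if either one is missing.
check_policy_routing() {
    cfg=$1
    status=0
    for opt in CONFIG_IP_ADVANCED_ROUTER CONFIG_IP_MULTIPLE_TABLES; do
        if grep -q "^${opt}=y" "$cfg" 2>/dev/null; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            status=1
        fi
    done
    return $status
}
```

A possible invocation, assuming the usual location: check_policy_routing "/boot/config-$(uname -r)"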

The default rules:

[ahu@home ahu]$ ip rule list
0: from all lookup local
32766: from all lookup main
32767: from all lookup default

This lists the priority of all rules. We see that all rules apply to all packets (’from all’). We’ve seen the ’main’ table before; it is output by ip route ls, but the ’local’ and ’default’ tables are new.

If we want to do fancy things, we generate rules which point to different tables, which allow us to override system-wide routing rules.

For the exact semantics on what the kernel does when there are more matching rules, see Alexey’s ip-cref documentation.
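Rules are consulted in order of increasing priority number, so 0 comes first. To see at a glance which table is tried in which order, a filter like the one below can reduce the rule listing to "priority table" pairs. This is a sketch; rules_in_order is a name made up for this example.

```shell
#!/bin/sh
# rules_in_order (hypothetical helper): read `ip rule list` output on
# stdin and print "priority table" pairs, lowest priority number first.
rules_in_order() {
    awk '
        /^[0-9]+:/ {
            prio = $1; sub(/:$/, "", prio)
            table = "?"
            # The table name is the word following "lookup".
            for (i = 2; i < NF; i++)
                if ($i == "lookup") table = $(i + 1)
            print prio, table
        }
    ' | sort -n
}
```

On a live system: ip rule list | rules_in_order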

4.1. Simple source policy routing

Let’s take a real example once again. I have 2 (actually 3, about time I returned them) cable modems, connected to a Linux NAT (’masquerading’) router. People living here pay me to use the Internet. Suppose one of my house mates only visits hotmail and wants to pay less. This is fine with me, but they’ll end up using the low-end cable modem.

The ’fast’ cable modem is known as 212.64.94.251 and is a PPP link to 212.64.94.1. The ’slow’ cable modem is known by various IP addresses, 212.64.78.148 in this example, and is a link to 195.96.98.253.

The local table:

[ahu@home ahu]$ ip route list table local
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1

[Figure: a Linux router sits between the local network and the Internet. Its interface if1 connects to Provider 1 and its interface if2 connects to Provider 2; both providers lead to the Internet.]

There are usually two questions given this setup.

4.2.1. Split access

The first is how to route answers to packets coming in over a particular provider, say Provider 1, back out again over that same provider.

Let us first set some symbolic names. Let $IF1 be the name of the first interface (if1 in the picture above) and $IF2 the name of the second interface. Then let $IP1 be the IP address associated with $IF1 and $IP2 the IP address associated with $IF2. Next, let $P1 be the IP address of the gateway at Provider 1, and $P2 the IP address of the gateway at Provider 2. Finally, let $P1_NET be the IP network $P1 is in, and $P2_NET the IP network $P2 is in.

One creates two additional routing tables, say T1 and T2. These are added in /etc/iproute2/rt_tables.
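Registering the names could look roughly like the sketch below. The table numbers 101 and 102 are arbitrary picks for this example (any number not already in use should do), and add_rt_table is a helper invented here; it takes the file as an argument so you can try it on a scratch copy before touching /etc/iproute2/rt_tables.

```shell
#!/bin/sh
# add_rt_table FILE NUMBER NAME (hypothetical helper): append the line
# "NUMBER<TAB>NAME" to FILE unless a table called NAME is already listed.
add_rt_table() {
    file=$1 num=$2 name=$3
    if ! grep -Eq "^[0-9]+[[:space:]]+${name}\$" "$file" 2>/dev/null; then
        printf '%s\t%s\n' "$num" "$name" >> "$file"
    fi
}
```

As root, this might be run as: add_rt_table /etc/iproute2/rt_tables 101 T1 and add_rt_table /etc/iproute2/rt_tables 102 T2. Running it twice leaves only one entry per table.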

Then you set up routing in these tables as follows:

ip route add $P1_NET dev $IF1 src $IP1 table T1
ip route add default via $P1 table T1
ip route add $P2_NET dev $IF2 src $IP2 table T2
ip route add default via $P2 table T2

Nothing spectacular, just build a route to the gateway and build a default route via that gateway, as you would do in the case of a single upstream provider, but put the routes in a separate table per provider. Note that the network route suffices, as it tells you how to find any host in that network, which includes the gateway, as specified above.

Next you set up the main routing table. It is a good idea to route things to the direct neighbour through the interface connected to that neighbour. Note the ‘src’ arguments, they make sure the right outgoing IP address is chosen.

ip route add $P1_NET dev $IF1 src $IP1
ip route add $P2_NET dev $IF2 src $IP2

Then, your preference for default route:

ip route add default via $P

Next, you set up the routing rules. These actually choose what routing table to route with. You want to make sure that you route out a given interface if you already have the corresponding source address:

ip rule add from $IP1 table T1
ip rule add from $IP2 table T2

This set of commands makes sure all answers to traffic coming in on a particular interface get answered from that interface.
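Put together, the whole split access setup might be scripted as in the sketch below. Every address and interface name in it is a made-up placeholder (taken from the documentation ranges), the tables T1 and T2 are assumed to be registered already, and Provider 1 is assumed to be the preferred default route. The run wrapper is also an invention of this example: by default it only prints the commands, so you can inspect them before running anything as root.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; substitute your own.
IF1=eth1; IP1=192.0.2.2;    P1=192.0.2.1;    P1_NET=192.0.2.0/24
IF2=eth2; IP2=198.51.100.2; P2=198.51.100.1; P2_NET=198.51.100.0/24

# run: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

split_access() {
    # Per-provider tables: a network route plus a default route each.
    run ip route add "$P1_NET" dev "$IF1" src "$IP1" table T1
    run ip route add default via "$P1" table T1
    run ip route add "$P2_NET" dev "$IF2" src "$IP2" table T2
    run ip route add default via "$P2" table T2
    # Main table: reach each provider network through its own interface.
    run ip route add "$P1_NET" dev "$IF1" src "$IP1"
    run ip route add "$P2_NET" dev "$IF2" src "$IP2"
    # Assumption of this sketch: Provider 1 is the preferred default.
    run ip route add default via "$P1"
    # Rules: traffic sourced from an uplink address uses that uplink's table.
    run ip rule add from "$IP1" table T1
    run ip rule add from "$IP2" table T2
}
```

Calling split_access prints the nine commands; calling it with DRY_RUN=0 set, as root, would execute them instead.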

Warning

Reader Rod Roark notes: if $P0_NET is the local network and $IF0 is its interface, the following additional entries are desirable:

ip route add $P0_NET dev $IF0 table T1
ip route add $P2_NET dev $IF2 table T1
ip route add 127.0.0.0/8 dev lo table T1
ip route add $P0_NET dev $IF0 table T2
ip route add $P1_NET dev $IF1 table T2
ip route add 127.0.0.0/8 dev lo table T2

Now, this is just the very basic setup. It will work for all processes running on the router itself, and for the local network, if it is masqueraded. If it is not, then you either have IP space from both providers or you are going to want to masquerade to one of the two providers. In both cases you will want to add rules selecting which provider to route out from based on the IP address of the machine in the local network.
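Such per-machine rules could be generated from a simple list, as in this sketch. The helper provider_rules and the input format (one "ADDRESS TABLE" pair per line) are inventions of this example, as are the addresses used with it; the function only prints the ip rule commands, it does not run them.

```shell
#!/bin/sh
# provider_rules (hypothetical helper): read "ADDRESS TABLE" pairs on
# stdin and print one `ip rule add` command per pair. Blank lines and
# lines starting with # are skipped.
provider_rules() {
    while read -r addr table; do
        case "$addr" in
            ''|\#*) continue ;;
        esac
        echo "ip rule add from $addr table $table"
    done
}
```

Fed with the lines "10.0.0.10 T1" and "10.0.0.11 T2", it prints one rule for each machine; after checking the output you could pipe it to sh as root.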

4.2.2. Load balancing

The second question is how to balance traffic going out over the two providers. This is actually not hard if you already have set up split access as above.

Instead of choosing one of the two providers as your default route, you now set up the default route to be a multipath route. In the default kernel this will balance routes over the two providers. It is done as