
snaproute Go BGP Code Dive (14): First Steps in Processing an Update

In the last post on this topic, we found the tail of the update chain. The actual event appears to be processed here—

case BGPEventUpdateMsg:
  st.fsm.StartHoldTimer()
  bgpMsg := data.(*packet.BGPMessage)
  st.fsm.ProcessUpdateMessage(bgpMsg)

This code is found around line 734 of fsm.go. The bgpMsg line in this snippet is interesting; it’s a little difficult to understand what it’s actually doing. There are a few crucial elements to figuring out what is going on here.

:=, in Go, declares a new variable and assigns a value to it in one step. But what, precisely, is being assigned to bgpMsg from data?

The * (asterisk) in front of a type name means “pointer to” that type, so *packet.BGPMessage is a pointer to a packet.BGPMessage. We’ve not talked about pointers before, so it’s worth spending just a moment with them. The illustration below will help a bit.

Each letter in the string “this is a string” is stored in a single memory location (this isn’t necessarily true, but let’s assume it is for this example). Further, each memory location has a location identifier, or rather some form of number that says, “this is memory location x.” This memory locator is, of course a number—hence the memory locator itself can be assigned to a variable, which can then be treated as a separate object from the string itself.

This memory locator is called a pointer.

It should only make sense that the locator is called a pointer, because it points to the string. The question that should pop up in your head right now is: “but wait, if each letter is stored in a different memory location, then which memory location does the pointer actually point to?” If you’re trying to describe the entire string, the pointer would normally point to the first character in the string. You can, of course, also describe just some part of the string by pointing to a memory location that’s someplace in the middle of the string. For instance, you could point to just the part of the string “is a string” by finding the memory location of the second “i” in the string, and storing its memory location.

How can you find the location of a string, or some other data structure? You place an & (ampersand) in front of it, like this:

myPointer := &aString

Now I have the pointer, but how do I get back to the value from the pointer? Like this:

aStringCopy := *myPointer

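So the * in front of a pointer variable follows the pointer and pulls out the data it points at for assignment to another variable. In front of a type name, the * has a related but different job: *packet.BGPMessage is the type “pointer to a packet.BGPMessage”. With that in mind, this line of code:

bgpMsg := data.(*packet.BGPMessage)

does two things:

  • it asserts that data, which arrives as a generic interface{} value, actually holds a pointer to a packet.BGPMessage (the .(...) syntax is a Go type assertion)
  • it assigns that pointer to the new variable bgpMsg

In other words, this does not copy the packet contents; it recovers a typed pointer to the message structure that was presumably built when the packet was received, so the message can be processed as an update. If data held anything other than a *packet.BGPMessage, this single-value form of the assertion would panic. We need to look elsewhere for the code that pulls the information out of this data structure; we will most likely find it when we start looking through ProcessUpdateMessage, which is where we will start next time.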
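To make the pointer and assertion mechanics concrete, here is a minimal, self-contained sketch. The BGPMessage struct, its Type field, and the asBGPMessage helper are stand-ins invented for illustration; they are not the real packet.BGPMessage or any function from the snaproute code.

```go
package main

import "fmt"

// BGPMessage is a hypothetical stand-in for packet.BGPMessage.
type BGPMessage struct {
	Type string
}

// asBGPMessage mirrors the shape of data.(*packet.BGPMessage), but uses the
// two-value form so a wrong type yields ok == false instead of a panic.
func asBGPMessage(data interface{}) (*BGPMessage, bool) {
	msg, ok := data.(*BGPMessage)
	return msg, ok
}

func main() {
	aString := "this is a string"
	myPointer := &aString     // & yields the address of aString
	aStringCopy := *myPointer // * follows the pointer back to the value
	fmt.Println(aStringCopy)

	original := &BGPMessage{Type: "UPDATE"}
	var data interface{} = original // an event handler only sees interface{}

	msg, ok := asBGPMessage(data)
	// The assertion hands back the same pointer that was stored in data.
	fmt.Println(ok, msg == original)

	_, ok = asBGPMessage("not a message")
	fmt.Println(ok)
}
```

The two-value form is used here only so the sketch can show the failure case safely; the line in fsm.go uses the single-value form.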

snaproute Go BGP Code Dive (12): Moving to Established

In last week’s post, the new BGP peer we’re tracing through the snaproute BGP code moved from open to openconfirmed by receiving, and processing, the open message. In processing the open message, the list of AFIs this peer will support was built, the hold timer set, and the hold timer started. The next step is to move to established. RFC 4271, around page 70, describes the process as—

If the local system receives a KEEPALIVE message (KeepAliveMsg (Event 26)), the local system:
 - restarts the HoldTimer and
 - changes its state to Established.

In response to any other event (Events 9, 12-13, 20, 27-28), the local system:
 - sends a NOTIFICATION with a code of Finite State Machine Error,
 - sets the ConnectRetryTimer to zero,
 - releases all BGP resources,
 - drops the TCP connection,
 - increments the ConnectRetryCounter by 1,
 - (optionally) performs peer oscillation damping if the DampPeerOscillations attribute is set to TRUE, and
 - changes its state to Idle.

 

For a bit of review (because this series is running so long, you might forget how the state machine works), the snaproute code is written as a state machine. There is a series of states the BGP peer must go through, each state being represented by a function in the fsm.go file. As the peer moves from one state to another, a function call “moves the pointer” from the current state to the next one, so that any event which occurs will call a different function, based on the current state. I know this is rather difficult to follow, but what this means, in practical terms, is that if the underlying TCP session is acknowledged or confirmed while the peer is in connect state, the following code from around line 272 in fsm.go is executed:

case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
 st.fsm.StopConnectRetryTimer()
 st.fsm.SetPeerConn(data)
 st.fsm.sendOpenMessage()
 st.fsm.SetHoldTime(st.fsm.neighborConf.RunningConf.HoldTime,
  st.fsm.neighborConf.RunningConf.KeepaliveTime)
 st.fsm.StartHoldTimer()
 st.BaseState.fsm.ChangeState(NewOpenSentState(st.BaseState.fsm))

However, if this same event occurs (an acknowledgement for the underlying TCP session is received) while the peer is in openconfirm state, a different set of code is executed, from around line 593 in fsm.go:

case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
 st.fsm.HandleAnotherConnection(data)
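The idea that one event leads to different actions depending on the current state can be sketched in a few lines. The State and Event types, their constants, and the action function below are simplified names made up for this example; the real code uses one processEvent method per state rather than a nested switch.

```go
package main

import "fmt"

// State and Event are simplified, hypothetical stand-ins for the FSM types.
type State int
type Event int

const (
	Connect State = iota
	OpenConfirm
)

const (
	TcpCrAcked Event = iota
	KeepAliveMsg
)

// action returns a description of what happens for an event in a given state.
func action(s State, e Event) string {
	switch s {
	case Connect:
		if e == TcpCrAcked {
			return "send OPEN, start hold timer, move to OpenSent"
		}
	case OpenConfirm:
		switch e {
		case TcpCrAcked:
			return "handle another connection"
		case KeepAliveMsg:
			return "restart hold timer, move to Established"
		}
	}
	return "not modeled in this sketch"
}

func main() {
	// Same event, two different states, two different actions.
	fmt.Println(action(Connect, TcpCrAcked))
	fmt.Println(action(OpenConfirm, TcpCrAcked))
}
```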

This is a general characteristic of any FSM: the event is matched against the current state to determine what action to take next. With all of this in mind, any event received while the peer is in openconfirm state will be processed by func (st *OpenConfirmState) processEvent, which is around line 558 in fsm.go. This code consists of a switch statement, which looks like this:

func (st *OpenConfirmState) processEvent(event BGPFSMEvent, data interface{}) {
 switch event {
  case BGPEventManualStop:
   ....
  case BGPEventAutoStop:
   ....
  case BGPEventHoldTimerExp:
   ....
  case BGPEventKeepAliveTimerExp:
   ....
  case BGPEventTcpConnValid: // Supported later
  case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed: // Collision Detection... needs work
   ....
  case BGPEventTcpConnFails, BGPEventNotifMsg:
   ....
  case BGPEventBGPOpen: // Collision Detection... needs work
  case BGPEventHeaderErr, BGPEventOpenMsgErr:
   ....
  case BGPEventOpenCollisionDump:
   ....
  case BGPEventNotifMsgVerErr:
   ....
  case BGPEventKeepAliveMsg:
   .... 
  case BGPEventConnRetryTimerExp, BGPEventDelayOpenTimerExp, BGPEventIdleHoldTimerExp,
   ....
  }
}

 

I’ve cut out the actions taken in each case to make it easier to see the structure of the entire switch statement in one sweep. Most of these options are actually error conditions that take exactly the same steps. Let’s look at one to see what it does—

case BGPEventHoldTimerExp:
 st.fsm.SendNotificationMessage(packet.BGPHoldTimerExpired, 0, nil)
 st.fsm.StopConnectRetryTimer()
 st.fsm.ClearPeerConn()
 st.fsm.StopConnToPeer()
 st.fsm.IncrConnectRetryCounter()
 st.fsm.ChangeState(NewIdleState(st.fsm))

 

If the hold timer expires while the peer is in openconfirmed state—

  • A notification is sent by SendNotificationMessage; this will tell the peer that the session is being torn down, so the two speakers can have synchronized state
  • The connect retry timer is stopped, so the local BGP speaker will not try to reconnect until the peer has passed through the idle state; this prevents any problems that might result from stepping outside the BGP state machine
  • The peer connection is cleared; this just empties the various data structures associated with the peer, so old information isn’t carried into a new peering session
  • The peering connection is stopped by StopConnToPeer
  • The connection retry counter is incremented, which allows the operator to see how many times this peer has been torn down and restarted
  • The state of the peer is changed to idle

This set of actions only changes slightly from state to state; if you search for this set of steps, you’re likely to find it at least a few dozen times throughout fsm.go.
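As a rough illustration of that repeated sequence, the sketch below records each teardown step in order. The recorder type and the teardown function are hypothetical; the real code calls methods on the FSM directly inside each case rather than through a helper like this.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for the FSM; it only records step
// names so the order of the teardown sequence is visible.
type recorder struct {
	steps []string
	state string
}

func (r *recorder) do(step string) { r.steps = append(r.steps, step) }

// teardown walks through the same steps, in the same order, as the
// BGPEventHoldTimerExp case shown above.
func teardown(r *recorder, reason string) {
	r.do("send notification: " + reason)
	r.do("stop connect retry timer")
	r.do("clear peer connection")
	r.do("stop connection to peer")
	r.do("increment connect retry counter")
	r.state = "Idle"
}

func main() {
	r := &recorder{state: "OpenConfirm"}
	teardown(r, "hold timer expired")
	for i, s := range r.steps {
		fmt.Println(i+1, s)
	}
	fmt.Println("state:", r.state)
}
```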

There is one other interesting point about this code worth mentioning. The folks at snaproute apparently haven’t implemented peer collision detection, as evidenced by the comments in the code itself. For instance—


  case BGPEventTcpConnValid: // Supported later
  case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed: // Collision Detection... needs work
   ....
  case BGPEventTcpConnFails, BGPEventNotifMsg:
   ....
  case BGPEventBGPOpen: // Collision Detection... needs work

Each of these three events—receiving a new TCP connection towards a peer that is already in openconfirmed state, or receiving an open message from a peer that is already in openconfirmed state— represents an event that should not take place. What should the snaproute code do here? According to section 6.8 of RFC4271, it should—

Unless allowed via configuration, a connection collision with an existing BGP connection that is in the Established state causes closing of the newly created connection.

So when they eventually fill this bit of code in, you can be pretty certain what the actual code will do: it will reset the peering session in a way that’s similar to the other error code already present. The bit of code that’s interesting in the context of moving from openconfirmed to established is around line 627 in fsm.go:

case BGPEventKeepAliveMsg:
 st.fsm.StartHoldTimer()
 st.fsm.ChangeState(NewEstablishedState(st.fsm))

 

The actual processing to move from openconfirmed to established is simple: if the local peer receives a keep alive message while in the openconfirmed state, move the peer to established.

As we’ve reached established state, the next step is to understand how updates are received and processed for this new peer.

snaproute Go BGP Code Dive (10): Moving to Open Confirm

In the last post on this topic, we traced how snaproute’s BGP code moved to the open state. At the end of that post, the speaker encodes an open message using packet, _ := bgpOpenMsg.Encode(), and then sends it. What we should be expecting next is for an open message from the new peer to be received and processed. Receiving this open message will be an event, so what we’re going to need to look for is someplace in the code that processes the receipt of an open message. All the way back in the fifth post of this series, we actually unraveled this chain, and found this is the call chain we’re looking for—

  • func (st *OpenSentState) processEvent()
  • st.fsm.StopConnectRetryTimer()
  • bgpMsg := data.(*packet.BGPMessage)
  • if st.fsm.ProcessOpenMessage(bgpMsg) {
    • st.fsm.sendKeepAliveMessage()
    • st.fsm.StartHoldTimer()
    • st.fsm.ChangeState(NewOpenConfirmState(st.fsm)) }

I don’t want to retrace all those steps here, but the call to func (st *OpenSentState) processEvent() (around line 444 in fsm.go) looks correct. The call in question must be a call to a function that processes an event while the peer is in the open state. This call seems to satisfy both requirements. There is a large switch statement in this function; let’s see if we can sort out what a few of these do to get a general sense of what is in this switch.

  • case BGPEventManualStop: this covers the case where the operator manually deconfigures or otherwise stops the BGP process, or the formation of this specific peer
  • case BGPEventAutoStop: this covers the case where the BGP process is brought down for some automatically generated reason; for instance, this (probably) covers the case where the BGP process is shut down because the system itself is going down
  • case BGPEventHoldTimerExp: when the peer was moved into the open state, the hold timer was configured and started running; if the hold timer expires before an open message is received from the peer, then a notification is sent and the peer is pushed back to idle state
  • case BGPEventTcpConnFails: if the TCP socket reports that the connection has failed, the peer is cleared and set back to active state

The particular bit of code in this switch we’re interested in is—

case BGPEventBGPOpen:
  st.fsm.StopConnectRetryTimer()
  bgpMsg := data.(*packet.BGPMessage)
  if st.fsm.ProcessOpenMessage(bgpMsg) {
    st.fsm.sendKeepAliveMessage()
    st.fsm.StartHoldTimer()
    st.fsm.ChangeState(NewOpenConfirmState(st.fsm))
  }

Well, this doesn’t look so bad, right? Just a few short lines of code. 🙂

st.fsm.StopConnectRetryTimer() is pretty obvious, so I won’t spend a lot of time here. The peer is now connected, so there’s no reason to keep running the timer that causes events when the timer expires.

bgpMsg := data.(*packet.BGPMessage) might not be so obvious at first. In order to reach this state, the local peer has received a packet of some type. The contents of that packet must somehow be processed to actually form the peering relationship. This line of code just creates a new variable called bgpMsg and assigns the received packet to this variable. The := operator is specific to go, so it’s probably worth pausing for a second to explain.

Typing is a method a programming language uses to control memory usage, catch errors in the code during the compilation process, etc. If you define a new variable that is supposed to hold a whole number, or a number without a floating point component (the fractional part after the decimal point), and assign it the value 2, you might do something like this in C—

int a_number;
a_number = 2;

go does things a little differently, placing the name of the variable before the type, like this—

var aNumber int
aNumber = 2

The first line is considered the variable declaration, while the second is the variable assignment. These are normally two separate steps. But in Go, there is a shortcut to this process. You can declare the variable and assign a value in one step, like this:

aNumber := 2

How does the compiler know what kind or type of variable aNumber is? By looking at the value assigned. In this case, the coder has declared a variable called bgpMsg, and assigned it the open message just received, in one step.
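A small runnable sketch of the same idea follows. The typeName helper is made up for this example; it uses the %T verb from the standard fmt package to print the type of the value it is given.

```go
package main

import "fmt"

// typeName reports the type of the value it is handed, via the %T verb.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var declared int // declaration
	declared = 2     // assignment

	inferred := 2 // declaration and assignment in one step
	ratio := 2.5  // the type comes from the value on the right
	name := "bgp"

	fmt.Println(declared, typeName(inferred), typeName(ratio), typeName(name))
}
```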

Next time, we’ll look at how this information is actually processed. ’til then, happy coding.

snaproute Go BGP Code Dive (8): Moving to Open

Last week we left off with our BGP peer in connect state, after looking through this code, around line 261 of fsm.go in snaproute’s Go BGP implementation:

func (st *ConnectState) processEvent(event BGPFSMEvent, data interface{}) {
  switch event {
  ....
    case BGPEventConnRetryTimerExp:
      st.fsm.StopConnToPeer()
      st.fsm.StartConnectRetryTimer()
      st.fsm.InitiateConnToPeer()
....

What we want to do this week is pick up our BGP peering process, and figure out what the code does next. In this particular case, the next step in the process is fairly simple to find, because it’s just another case in the switch statement in (st *ConnectState) processEvent

case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
  st.fsm.StopConnectRetryTimer()
  st.fsm.SetPeerConn(data)
  st.fsm.sendOpenMessage()
  st.fsm.SetHoldTime(st.fsm.neighborConf.RunningConf.HoldTime,
    st.fsm.neighborConf.RunningConf.KeepaliveTime)
  st.fsm.StartHoldTimer()
  st.BaseState.fsm.ChangeState(NewOpenSentState(st.BaseState.fsm))
....

This looks like the right place—we’re looking at events that occur while in the connect state, and the result seems to be sending an open message. Before we move down this path, however, I’d like to be certain I’m chasing the right call chain, or logical thread. How can I do this? This code is called when (st *ConnectState) processEvent is called with an event called BGPEventTcpCrAcked or BGPEventTcpConnConfirmed. Let’s chase down where these events might come from to see if this really is the next step in the call chain we’re trying to chase.

Note: Sometimes it’s easier to chase from the end result back towards the caller, and sometimes it’s not. There’s no way to know which is which until you have more experience in chasing through code. It takes time and practice to build these sorts of skills up, just like many other skills—but in chasing through code, you’re not only learning the protocols better, you’re also learning how to code better.

To find what we’re looking for, we can search through the project files for some instance of BGPEventTcpCrAcked, which seems to be the result of receiving an ACK for a TCP session initiated by BGP. We find a few places in fsm.go, as always, but most of them are using the event, rather than causing (or throwing) it—

272: case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
371: case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
475: case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
592: case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:
709: case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed:

Until we get to this one—

case inConnCh := <-fsm.inConnCh:

What does this do? This is a little complex, but let’s try to work through it. When starting a new peer, a port was cloned on which to send TCP packets to the peer. Since the port is cloned to a port the main FSM function is watching—(fsm *FSM) StartFSM()—the main FSM function is going to be notified of any inbound TCP packets received on the local device. When one specific sort of packet is received, an acknowledgement in a new TCP session, the main FSM function is called, resulting in case inConnCh := <-fsm.inConnCh: being called. This, in turn, calls (st *ConnectState) processEvent with BGPEventTcpCrAcked.

If you followed that, you know this verifies what it looked like in the first place—the code above is, in fact, the correct code to process the next phase of peering. The call chain looks something like this—

  • (fsm *FSM) StartFSM() is watching the TCP ports for any new packets
  • When (fsm *FSM) StartFSM() receives a new TCP ACK, it matches case inConnCh := <-fsm.inConnCh: in the select statement
  • This, in turn, calls (st *ConnectState) processEvent with BGPEventTcpCrAcked
  • (st *ConnectState) processEvent falls through to the case statement case BGPEventTcpCrAcked, BGPEventTcpConnConfirmed, which then calls the correct functions to move beyond connect state
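The chain above can be sketched as a loop that waits on a channel and hands the resulting event to the current state. Everything here is a simplified assumption: the fsm struct, the stopCh channel, the string-typed connection, and the run method are invented for the example, and only the inConnCh name is borrowed from the code being discussed.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for BGPFSMEvent.
type Event string

type fsm struct {
	inConnCh chan string   // new connections, modeled here as strings
	stopCh   chan struct{} // hypothetical: lets the sketch end cleanly
	handled  []Event
}

// processEvent stands in for the current state's processEvent method.
func (f *fsm) processEvent(e Event, data interface{}) {
	f.handled = append(f.handled, e)
	fmt.Println("event:", e, "data:", data)
}

// run blocks until something arrives on one of the channels, then hands the
// matching event to the current state.
func (f *fsm) run() {
	for {
		select {
		case conn := <-f.inConnCh:
			f.processEvent("TcpCrAcked", conn)
		case <-f.stopCh:
			return
		}
	}
}

func main() {
	f := &fsm{inConnCh: make(chan string), stopCh: make(chan struct{})}
	done := make(chan struct{})
	go func() { f.run(); close(done) }()

	f.inConnCh <- "conn to 192.0.2.1" // unbuffered: returns once run has received it
	close(f.stopCh)
	<-done
}
```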

It’s okay if you have to read all of that several times—FSMs (Finite State Machines—remember?) can be very difficult to follow. This means we need to chase down each of these functions to find out how this implementation of BGP actually moves beyond the open state—

  • st.fsm.StopConnectRetryTimer()
  • st.fsm.SetPeerConn(data)
  • st.fsm.sendOpenMessage()
  • st.fsm.SetHoldTime(st.fsm.neighborConf.RunningConf.HoldTime, st.fsm.neighborConf.RunningConf.KeepaliveTime)
  • st.fsm.StartHoldTimer()
  • st.BaseState.fsm.ChangeState(NewOpenSentState(st.BaseState.fsm))

It’s pretty obvious what StopConnectRetryTimer does—it stops BGP from continuing to try to connect to this peer. Since the peer has acknowledged the initial TCP packet, we shouldn’t keep trying to send it initial TCP packets. SetPeerConn is a bit harder—

func (fsm *FSM) SetPeerConn(data interface{}) {
  if fsm.peerConn != nil {
    return
  }
  pConnDir := data.(PeerConnDir)
  fsm.peerConn = NewPeerConn(fsm, pConnDir.connDir, pConnDir.conn)
  go fsm.peerConn.StartReading()
}

This just does some general logging (which I’ve removed for clarity), and then tells the main process (through the FSM call) to start reading packets off this new peer’s data structure. I’m not going to dive into these functions deeply here.
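One detail worth a second look is the guard at the top of the function: if a connection is already attached, a second one is ignored. The sketch below isolates just that behavior. PeerConnDir here is a cut-down, hypothetical version that holds strings, and setPeerConn returns a bool purely so the result is easy to observe; the goroutine that starts reading is left out.

```go
package main

import "fmt"

// PeerConnDir is a hypothetical, simplified stand-in: the real structure
// carries a network connection rather than a string.
type PeerConnDir struct {
	connDir string
	conn    string
}

type fsm struct {
	peerConn *PeerConnDir
}

// setPeerConn keeps the first connection it is handed and ignores later
// ones, mirroring the nil check at the top of SetPeerConn. It reports
// whether the connection was stored.
func (f *fsm) setPeerConn(data interface{}) bool {
	if f.peerConn != nil {
		return false // a connection is already attached
	}
	p, ok := data.(PeerConnDir) // comma-ok form: no panic on a wrong type
	if !ok {
		return false
	}
	f.peerConn = &p
	return true
}

func main() {
	f := &fsm{}
	fmt.Println(f.setPeerConn(PeerConnDir{connDir: "out", conn: "first"}))
	fmt.Println(f.setPeerConn(PeerConnDir{connDir: "in", conn: "second"}))
	fmt.Println(f.peerConn.conn)
}
```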

Next time, we’ll look at the four remaining functions, as these are where the action really is from a BGP perspective.

snaproute Go BGP Code Dive (4): Starting a Peer

In the last three episodes of this series, we discussed getting a copy of SnapRoute’s BGP code using Git, we looked at the basic structure of the project, and then we did some general housekeeping. At this point, I’m going to assume you have the tools you need installed, and you’re ready to follow along as we ask the code questions about how BGP actually works.

Now, let’s start with a simple question: how does BGP bring a new peer up?

It seems like we should be able to look for some file that’s named something with peering in it, but, looking at the files in the project, there doesn’t seem to be any such thing.

[Image: listing of the files in the Go BGP project]

Hmmm… Given it’s not obvious where to start, what do we do? There are a number of options, of course, but three options stand out as the easiest.

First, you can just poke around the code for a bit to see if you find anything that looks like it might be what you’re looking for. This is not, generally, for the faint of heart. Over time, as you become more familiar with the way coders tend to structure things, this method will work more often than not, but for now, let’s look for an easier way to find a starting point.

Second, you could compile the code, setting breakpoints in the TCP code just before TCP sends packets off to the correct process for handling, fire a BGP packet at the box using a simulator, and see where the code actually takes you. This also isn’t for the faint of heart, so let’s see if we can think of something simpler.

Third, you can do it the way I normally do. Run the code as a process and turn on debugging, generally in an emulator so you can connect peers, etc. Now, connect a peer, and capture a few debug messages. Shut everything down and return to your code directory, then search the code for one or more of the debug messages you just captured. This should lead you to the code you’re looking for. In this case, I run into a message that looks something like this—

Neighbor: x.x.x.x FSM xx ConnEstablished - start

There is one word in here that is pretty odd, ConnEstablished, and hence will probably yield results if I do a search for it. I’m going to resort to Atom here, as I don’t want to get into grep on the command line, but doing a search across the entire project shows me:

[Image: project-wide search results for ConnEstablished]

Hmmm… The most interesting of these is the one where the message is actually printed on the console, which is line 1319 in fsm.go. Popping into this file, I find the following—

func (fsm *FSM) ConnEstablished() {
  fsm.logger.Info(fmt.Sprintln("Neighbor:", fsm.pConf.NeighborAddress, "FSM", fsm.id, "ConnEstablished - start"))
  fsm.Manager.fsmEstablished(fsm.id, fsm.peerConn.conn)
  fsm.logger.Info(fmt.Sprintln("Neighbor:", fsm.pConf.NeighborAddress, "FSM", fsm.id, "ConnEstablished - end"))
}

Now if I find where this function is called, I can find out where, in the code, neighbors are actually established. This process of tracing back from the end point to figure out what’s actually happening can be a bit tedious, but until you’re more familiar with the basic structure of the code, it’s often the only choice you’re going to have.

As it turns out, the name of the function we need to find is ConnEstablished. We can repeat our original search to find out where this function is actually called (see the image above, as it’s the same search). There is only one call to this function, found in the same file—

func (fsm *FSM) ChangeState(newState BaseStateIface) {
  ....
  } else if oldState != BGPFSMEstablished && fsm.State.state() == BGPFSMEstablished {
    fsm.ConnEstablished()
  }
}

You might notice there are a number of calls to PeerConnEstablished, as well—and we would simplify our search by jumping directly to that call—but for the moment let’s take the long way around by tracing back one step at a time.

Looking at the code, we find there are a number of calls to ChangeState, but the one that’s interesting is here—

st.fsm.ChangeState(NewEstablishedState(st.fsm))

Which is around line 611 in fsm.go, for those who are trying to follow along. This particular call is interesting among all the other calls because it is the only one that mentions the state we’re looking for, established state. We can figure out where to look next by going to the top of the function in which this line of code is called, which is—

func (st *OpenConfirmState) processEvent(event BGPFSMEvent, data interface{}) {
  st.logger.Info(fmt.Sprintln("Neighbor:", st.fsm.pConf.NeighborAddress, "FSM:", st.fsm.id,
    "State: OpenConfirm Event:", BGPEventTypeToStr[event]))
  ....

Now we’ve run into something odd—the function name is literally processEvent. This seems a little generic. In fact, if we search the code for processEvent, we’re going to find hundreds of instances of this function call. It looks like we’re lost in the weeds, doesn’t it? Not necessarily…

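If you’ll notice, just before the function name, there’s a set of parentheses containing (st *OpenConfirmState). In Go this is called a method receiver: it attaches processEvent to the OpenConfirmState type, and st is a pointer to the state the method is running against. It is roughly what I would do in C by passing a pointer to a structure as the first argument, something that’s rather common in building a finite state machine like this in code. Let me explain…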

A finite state machine is normally a flow chart that shows each possible state the system can be in, how it can enter that state, and how it can exit the state. Sometimes this FSM is represented in text form, where the state is listed, possible inputs are listed, and the resulting state is given for each possible input in this particular state. For instance, the BGP specification describes the events that drive such an FSM, as shown below:

8.1.4.  TCP Connection-Based Events

Event 14: TcpConnection_Valid

Definition: Event indicating the local system reception of a TCP connection request with a valid source IP address, TCP port, destination IP address, and TCP Port.  The definition of invalid source and invalid IP address is determined by the implementation.

BGP's destination port SHOULD be port 179, as defined by IANA.

TCP connection request is denoted by the local system receiving a TCP SYN.

Status: Optional

Optional Attribute Status:

1) The TrackTcpState attribute SHOULD be set to TRUE if this event occurs.

Event 15: Tcp_CR_Invalid

Definition: Event indicating the local system reception of a TCP connection request with either an invalid source address or port number, or an invalid destination address or port number.

When we run into something like processEvent in a file called fsm, we’re probably looking at a finite state machine broken up into a set of functions, each of which represents a single state, and each of which performs the right actions to move from the current state to a new state in the FSM. I know this is difficult to grok, so let me give you a more visual representation.

[Image: example finite state machine with states A, B, C, and D]

State A is where we begin… This state would be represented as a single function in the source code. When State A is reached, this function is called, and, depending on the input, the function for State A will either call State B’s function, or State C’s function. This chain of events will continue until the final state is reached, and the FSM either enters a steady state, or exits. What tends to be confusing about this process is that these functions might not, in fact, call one another. Instead, what generally happens is the function for State A will be called, which will result in State C being the new state. The program will exit and wait for another event. When this next event occurs, the application will send this new event to the function for the current state, State C, which will then process the event, leaving the process in state D, for instance. This process/move to a new state/exit/wait cycle happens until the state reaches a steady state, or until the process ends.

Instead of calling each function a different name, this code is built with the same function name in each state structure. Each state is represented by a structure, and each structure has a function that is called when an event happens while the FSM is in that particular state. If you’re in state A and an event occurs, you call (*stateA) processEvent. If you’re in state B and an event occurs, you call (*stateB) processEvent. There is one structure for each state, and a single function to handle events while in that state.
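Here is a minimal sketch of that pattern: one structure per state, each with a method of the same name. All of the names (state, stateA, stateB, machine, handle) are invented for illustration. One simplification to note: in this sketch processEvent returns the next state, whereas the snaproute code changes state by calling ChangeState from inside the handler.

```go
package main

import "fmt"

// All names here are hypothetical; they only mirror the shape of the pattern.
type Event string

// state is implemented by one struct per FSM state.
type state interface {
	name() string
	processEvent(e Event) state // returns the state to be in after the event
}

type stateA struct{}
type stateB struct{}

func (stateA) name() string { return "A" }
func (stateB) name() string { return "B" }

// The same method name on each state type; the receiver decides which runs.
func (s stateA) processEvent(e Event) state {
	if e == "go" {
		return stateB{}
	}
	return s
}

func (s stateB) processEvent(e Event) state {
	if e == "reset" {
		return stateA{}
	}
	return s
}

// machine holds the current state and forwards every event to it.
type machine struct{ current state }

func (m *machine) handle(e Event) { m.current = m.current.processEvent(e) }

func main() {
	m := &machine{current: stateA{}}
	for _, e := range []Event{"go", "go", "reset"} {
		m.handle(e)
		fmt.Println("after", e, "->", m.current.name())
	}
}
```

The caller never names a state-specific function; it always calls processEvent on whatever the current state happens to be, which is why searching for the function name alone turns up so many matches.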

This means we’re not going to be able to just jump back function by function to trace what happens. Instead of tracing the functions, we’re going to need to trace the state by looking at the function within each state that deals with events. Lucky for us, the current state is contained right there in the function call—(st *OpenConfirmState). What we’re going to need to do, then, is trace back the successive states by looking at how we get to OpenConfirmState, and then how we get to the state that gets us to OpenConfirmState, etc. Along the way, we’re going to see precisely how a new peer is brought up in this version of BGP. We’ll start tracing these states next time.

snaproute Go BGP Code Dive (3)

This week, I want to do a little more housekeeping before we get into actually asking questions of the bgp code. First there is the little matter of an editor. I use two different editors most of the time, Notepad++ and Atom.

  • Notepad++ is a lightweight, general purpose text editor that I use for a lot of different things: XML, HTML, CSS, Python, Javascript, C, and just about anything else. This is an open source project hosted on the Notepad++ web site with the code hosted at github.
  • Atom is a more GUI oriented “programmer’s editor.” This is a more full featured editor, going beyond basic syntax highlighting into projects, plugins that pull diffs in side by side windows, and the like. I don’t really have a build environment set up right now, so I don’t know how it would interact with compiled code, but I assume it would probably have a lot of the tricks I’m used to, like being able to trace calls through the code, etc. Atom is available here.

I haven’t actually chosen one or the other; I tend to use both pretty interchangeably, so you’re likely to see screen shots from both/either as I move through various bits of the code. For a second bit of “housekeeping,” I want to point out up front how project files are usually structured. This can be a little confusing to folks who haven’t worked on large projects, so this might be helpful.

Code is built from front, or top, of the file to the end, or the back, of the file. To understand, assume I’m going to build a small program that either adds or multiplies two numbers, based on the operator in the arguments. Suppose I want to be able to extend it later, and I don’t like switch statements, so I decide to implement it as three different functions, like this—

int operate(int num1, int num2, int operator) {
  if (operator == 1) return add(num1, num2);
  if (operator == 2) return multiply(num1, num2);
  return 0; /* unknown operator */
}

int add(int num1, int num2) {
  return num1 + num2;
}

int multiply(int num1, int num2) {
  return num1 * num2;
}

I’ve not dealt with overflows, floating points, and all the rest here; this is just to illustrate a simple principle (I’d normally #define the operators, or build an array and call straight through based on a pointer to the actual operator function, but this isn’t elegant, this is simple). If I actually tried to compile this bit of code, I’d get an error on this line:

if (operator == 1) return add(num1, num2);

—telling me “add” isn’t defined. There are two ways I can solve this. The first would be to declare the function someplace. Normally I’d do this in a file that’s included in other files, but with something this simple I might just put a function declaration at the top of the file and leave it at that. The problem with either of these methods is that I must remember to change the declaration if I happen to change the function itself. For instance, if I modify “add” to accept floating point numbers, I need to not only change the function, but the declaration of the function, as well.

Instead of declaring things in this way, what is normally done is to build “helper” and “basic” functions first, then to call them in more complex functions later on in the same file. If you’re going to break up a single project into multiple files, of course, you still have to make certain you declare things you’ve included in one file, and want to use in another. But in many cases, an entire module is only going to have one or two functions that can be called from other files—the majority of the functions are going to be “hidden,” in that they’re only used in the single file in which they’re defined and coded up.

I know that’s a big chunk of text there, but to give an example, the simplest thing here is to do this:

int add(int num1, int num2) {
  return num1 + num2;
}

int multiply(int num1, int num2) {
  return num1 * num2;
}

int operate(int num1, int num2, int operator) {
  if (operator == 1) return add(num1, num2);
  if (operator == 2) return multiply(num1, num2);
  return 0; /* unknown operator */
}

Now by the time operate is compiled, the functions it relies on have already been compiled; the compiler knows where to find them, and what they actually do. To complete the example, this makes perfect sense if the only function in this file I ever intend to call from anyplace else is operate. I can declare operate in a file that’s included in other files, and just leave add and multiply as local declarations, only usable from within the file in which operate is defined.

As a practical matter, this means the big chunks of code are going to be at the bottom of any given file. If you’re looking, for instance, for how BGP processes a particular packet in the BGP code, you’re going to find the answer at the bottom of the file containing the function that processes that particular packet. To understand how the function operates, you’re going to need to “trace up” the file, examining each of the “helper functions.”

But when you’re reading code, you need to start at the bottom of the file, not the top.