PROACTIVE REBOOTING CUTS DOWN TIME TO SECONDS

Addressable Base Units discussed in this article are:

APB (Auto Push Board)

IPN (Intelligent AC power Node)

X2 (AB-Matrix Switch)

Controllers referenced:

SP-RRC (Telephone Line Touch-tone Controller)

N-RRC (TCP/IP Network Controller)

Servers, Routers, PCs, WiFi Access Points and more

As networks grow in size and complexity and more interdependent hardware is added, the potential for failure also grows. The added hardware is subject to all of the incompatibilities, software glitches, electrical disturbances, dirty power and other effects that cause it to occasionally hang and need to be reset.

Traditionally, the reset has been accomplished by cycling the AC power. This traditional approach is fine, except that some hardware devices have unique characteristics. For example, PCs or servers with ATX motherboards may require the power button to be pushed after a power cycle, while an inaccessible wireless AP with power over Ethernet (PoE) may only require a much simpler device. Longer or shorter reset times, as well as AC power stabilization times, may also be desirable.

Aside from these nuances, "yanking the power" was a relatively simple and inexpensive task when the hardware and a willing body were easily accessible. When this was not the case, however, long down times could be experienced. Some of the more dispersed networks eventually evolved to rebooting the network components remotely, either out of band via the phone lines or over the network itself if it was available. This was normally a separate task, but it could be integrated into a network management system.

Remote rebooting provided a vast improvement in up time and it saved a bundle in field trips to remote locations. However, there were still idiosyncrasies and gaps between a fault notification, network availability, the diagnostics and the "call to action". Ideally, when this need arises, you would like to have this all happen automatically, independent of the network and with very minimal down time (seconds or just a few minutes); rather than waiting for an e-mail, or your phone to ring, the network to come back up or the "guy who normally handles that" to be available.

Proactive Rebooting
Automating the rebooting task to achieve a proactive approach is the next step in reducing down time and cost. This is one of the objectives of the CPS product line. By combining addressable "Base Units" that are designed for specific rebooting tasks with Automatic software utilities operating locally at remote sites, the customer can achieve unparalleled levels of up time and at a reasonable cost. However, out-of-band access via a phone line or the network to the remote site can also be maintained, if desired, with the addition of the appropriate remote controller or software. See Base Units and Controllers.

All of the addressable base units have RS-232 serial ports which receive the automated commands from a "Control PC" or server (yours) that runs the automated software at the remote site, closer to the action and independent of the WAN. They also have Control Ports that can receive remote commands through a controller, if one is available. Note that if the network is available, the "Control PC" can also receive direct telnet ON/OFF serial port commands over the network using the "Serial Net" software utility. This could be used as a stand-alone function, or integrated into network management software.

This provides several optional methods of controlling the Base Units. However, the two automatic software utilities that make the proactive automated approach feasible are the "Heartbeat" and the "Auto Ping Rebooter".

Heartbeat
The Heartbeat is a self-monitoring system. It is a Windows-based software utility that normally runs on the target machine itself. It sends a heartbeat through the serial port to which the Base Unit is connected. If the system or your linked application (see serial DLL) hangs, the Base Unit cycles the AC power or pushes the reset button, depending on the Base Unit selected. A standard version that operates out of the System Tray on any Windows 95 or later system is available, as well as a version that can run as a service on NT or later machines. It also has e-mail notification.

Normally the I-APB-H (internal Auto Push Board) base unit is used in this application. It will push the reset or power buttons or both depending on how it is installed; although the ARR or IPN (external AC power units) may also be used. However, the APB is less expensive and it will also push the power button again, if necessary, after power is restored and stabilized, if configured to do so.
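The timing logic behind a heartbeat of this kind can be sketched in a few lines. The following Python example is only an illustration under stated assumptions, not the CPS implementation: the class name HeartbeatWatchdog, the 30-second timeout and the poll() method are all hypothetical, and times are passed in explicitly so the logic can be followed without real hardware.

```python
class HeartbeatWatchdog:
    """Illustrative watchdog: requests a reset when heartbeats stop arriving.

    The timeout value and this interface are assumptions for the sketch;
    the actual Base Unit behavior is not documented in this article.
    """

    def __init__(self, timeout_s: float, now: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self.last_beat = now   # time of the most recent heartbeat
        self.resets = 0        # number of resets requested so far

    def beat(self, now: float) -> None:
        """Record a heartbeat received from the monitored machine."""
        self.last_beat = now

    def poll(self, now: float) -> bool:
        """Return True if a reset should be triggered at time `now`."""
        if now - self.last_beat > self.timeout_s:
            self.resets += 1
            # Restart the timer so the machine has time to boot again.
            self.last_beat = now
            return True
        return False


wd = HeartbeatWatchdog(timeout_s=30.0, now=0.0)
wd.beat(10.0)                 # machine is alive at t = 10 s
print(wd.poll(35.0))          # 25 s of silence: no reset yet
print(wd.poll(41.0))          # 31 s of silence: reset requested
```

Restarting the timer after a reset is a design choice in this sketch: it gives the rebooted machine a full timeout period to come back before another reset is requested.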

Auto Ping Program
The Auto Ping program is intended to monitor the surrounding network components locally at a remote site for the purpose of rebooting them, as opposed to pinging them remotely from a central site (which is network dependent). It normally runs as a service on a nearby PC or server (up to 1500 feet away), designated as the "Control PC". It uses the addressable feature of the Base Units. It pings the designated IP addresses. If the return ping fails, a reboot command is sent through the serial port of the "Control PC" to the Base Unit corresponding to that IP address. To reset routers, or any device that cannot be directly pinged, you would set the software to ping a different address through the router to determine whether the router or the pinged address is at fault.
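The ping-then-reboot cycle described above can be sketched as follows. This is a minimal Python illustration, not the Auto Ping program itself: the IP addresses, the unit addresses, the retry count and especially the command string "UNIT n RESET" are made up for the example, since the real serial command syntax is not given in this article. The ping and send functions are passed in as parameters so that the logic can be read and tested without a network or a serial port.

```python
from typing import Callable, Dict, List

# Hypothetical mapping: monitored IP address -> Base Unit address
# on the daisy chain.
TARGETS: Dict[str, int] = {
    "192.168.1.10": 1,
    "192.168.1.20": 2,
}


def format_reset(unit_address: int) -> str:
    """Build a reset command. The syntax is invented for this sketch."""
    return f"UNIT {unit_address} RESET\r"


def check_and_reboot(
    targets: Dict[str, int],
    ping: Callable[[str], bool],
    send: Callable[[str], None],
    retries: int = 3,
) -> List[int]:
    """Ping each target; send a reset to the matching Base Unit on failure.

    A target counts as down only if all `retries` pings fail.
    Returns the list of Base Unit addresses that were reset.
    """
    rebooted: List[int] = []
    for ip, unit in targets.items():
        alive = any(ping(ip) for _ in range(retries))
        if not alive:
            send(format_reset(unit))
            rebooted.append(unit)
    return rebooted


# Example with stand-in functions: only 192.168.1.10 answers.
sent: List[str] = []
units = check_and_reboot(TARGETS, lambda ip: ip == "192.168.1.10", sent.append)
print(units, sent)
```

In a real deployment, ping would wrap the operating system's ping command and send would write to the serial port of the "Control PC" (for example through a serial library such as pyserial); both are left out here to keep the sketch self-contained.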

The addressable Base Units are "daisy chained" using standard phone cable and RJ-11 jacks. They could all be the same type of Base Units or mixed and matched to the types of network components to be reset. As an example, an IPN (Intelligent AC Power Node) could be used to reset power to a router or any type of device for that matter. The APB (Auto Push Board) is normally used on PC based equipment and the X2 version of the ABM switch is normally used to reset wireless access points that use PoE. All of these devices understand the same ON/OFF/RESET commands.

Possibilities
A number of variations may be developed from this concept. As an example, if there were two adjacent kiosks or servers, each could ping the other, thereby keeping each other awake. Each would have a Base Unit controlled by the other machine's serial port. This "pairs" approach could work for any number of machines. You might also wake up a dead machine with an unrelated machine or over a network segment that can be reached (no controller required). A firewall, or Windows-based device, with a serial port may be used as the "control PC". The Heartbeat may also be installed on the "control PC" for a totally self-rebooting automated system. Two serial ports or a USB/serial cable would be required. Other software is available that allows remote access to the serial port of the "control PC" over the network. See the following page of diagrams.

Adding a Remote Controller
As previously stated, a remote controller could also be added to control the daisy chain remotely out of band over a phone line (SP-RRC) or over the network (N-RRC). This requirement may not be as great when utilizing the automated approach as it would be when only remote control is used. However, the approaches are not mutually exclusive.

The controllers also provide secure access and direct ON/OFF commands as well as other programmable features. The Telco line controller can also share a phone line with other normal voice, fax or modem activity, thereby saving an extra phone line for what may be an infrequently used activity.

The controllers connect to the "Control Port" on each base unit with standard telephone cable and RJ-11 connections similar to the "daisy chain". Note that the Controllers can also be used with non addressable Base Units on a one-to-one port basis.

 

Possible Automatic Hardware/Software Rebooting Configurations

Proactive 1 Diagram

Pinging through a router to reset the router using an external IPN ($115.00). Multiple terminals can be pinged to determine if the router or the terminals were at fault on a failed return ping.
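The decision described in this configuration can be sketched as a small function. This Python example is only an illustration under assumptions: the function name classify_fault, the verdict strings and the rule "all terminals unreachable means the router is suspected" are choices made for the sketch, not documented behavior of the Auto Ping program. The ping function is passed in so the logic can be followed without a network.

```python
from typing import Callable, Iterable, List, Tuple


def classify_fault(
    terminal_ips: Iterable[str],
    ping: Callable[[str], bool],
) -> Tuple[str, List[str]]:
    """Ping several terminals behind one router and guess what failed.

    Returns (verdict, failed_ips) where verdict is:
      "ok"        - every terminal answered
      "router"    - no terminal answered, so the shared router is suspected
      "terminals" - some answered, so the router works and only the
                    silent terminals are suspected
    With a single terminal, a failure cannot be told apart from a
    router failure and is reported as "router".
    """
    results = {ip: ping(ip) for ip in terminal_ips}
    failed = [ip for ip, ok in results.items() if not ok]
    if not failed:
        return "ok", []
    if len(failed) == len(results):
        return "router", failed
    return "terminals", failed


# Example with a stand-in ping: only the first terminal answers.
print(classify_fault(["10.0.1.5", "10.0.1.6"], lambda ip: ip == "10.0.1.5"))
```

Pinging more than one terminal is what makes the distinction possible: a single reply through the router is enough to show that the router itself is still forwarding traffic.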

Proactive 2 Diagram

Adding a PC with an internal I-APB "Auto Push Board" ($69.00) to reset a failed PC. Also begins a "daisy chain" (upstream/downstream RJ-11 connectors).

Proactive 3 Diagram

Adding an ABM-X2 ($140.00) to reset WiFi access points with PoE. Each ABM can control two lines (two pairs of RJ-45 in/out connectors).

Proactive 4 Diagram

Adding the "heartbeat" feature to the "Control PC" to reset it externally if it fails. This is also normally used with kiosks or other installations where there may be only one machine. I-APB-H ($79.00)

Proactive 5 Diagram

How the "Control PC" can be accessed remotely over the network to send TELNET type messages to the remote serial port. Also provides direct ON and OFF control in addition to the reset. Serial Net (FREE)

Proactive 6 Diagram

Adding a CPS CONTROLLER to directly access the remote "daisy chain". Use the N-RRC ($169.00) for access over the network or the SP-RRC ($135.00) for access over the telephone lines. Also provides direct ON and OFF control in addition to the reset as well as secure access.

Proactive 7 Diagram

How pairs of machines with I-APB boards or external IPNs can ping each other to keep the other awake. The serial ports connect to the alternate I-APB boards or IPNs. For an odd number of machines, two serial ports would be required in one machine.
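One way to plan the wiring for the "pairs" approach is sketched below in Python. This is an interpretation, not a documented procedure: the function name assign_watchers and the rule for the odd machine out (it is watched by the first machine, which then needs two serial ports, and it watches no one itself) are assumptions made for the example.

```python
from typing import Dict, List


def assign_watchers(machines: List[str]) -> Dict[str, List[str]]:
    """Map each machine to the machines whose Base Units it controls.

    Machines are paired in list order and each member of a pair watches
    the other. With an odd count, the leftover machine is watched by the
    first machine. The number of entries in a machine's list is the
    number of serial ports it needs.
    """
    watchers: Dict[str, List[str]] = {m: [] for m in machines}
    paired = len(machines) - (len(machines) % 2)
    for i in range(0, paired, 2):
        a, b = machines[i], machines[i + 1]
        watchers[a].append(b)
        watchers[b].append(a)
    if len(machines) % 2 == 1 and len(machines) > 1:
        # Odd machine out: the first machine takes it on as a second target.
        watchers[machines[0]].append(machines[-1])
    return watchers


plan = assign_watchers(["KioskA", "KioskB", "KioskC"])
for machine, watched in plan.items():
    print(machine, "needs", len(watched), "serial port(s) for", watched)
```

A ring, in which each machine watches the next one, would be another way to cover an odd number of machines with one serial port each; the sketch follows the pairs layout because that is the one described in this article.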