Handling Geospatial Data in a Digital Age

December 12, 2007 Mike Tully

This article was contributed by guest author: Tyler Coan

The geospatial industry has always changed and embraced technology, but with the emergence of the fully digital work environment, your organization’s computer systems are key to how quickly and securely your critical geospatial data is transferred and stored. Is your system up to the demands of your data?

The ideal computing environment depends on your specific needs and budget. Typically, drive space is critical because imagery files are both large and numerous; workstation speed is necessary to process quality images in a timely manner; and network speed is important to move files efficiently. Further, it is very important to create redundant processes and backup systems to prevent loss of work.

Drive Space & Storage

The most important component in a geospatial environment is drive space. The more space you have, the more data you can store and immediately access. It is best to keep your data centralized in a client/server environment. This allows for easy access and sharing of the files by anyone logged into your network.
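As a rough illustration of keeping an eye on available space, the sketch below reports how much room is left on the volume that holds a given folder. The function name `free_space_report` and the 15% warning threshold are assumptions made for this example, not recommendations from the article.

```python
import shutil


def free_space_report(path, warn_below_fraction=0.15):
    """Return (free_bytes, free_fraction, is_low) for the volume holding `path`.

    The 15% default threshold is an arbitrary, illustrative choice.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return usage.free, free_fraction, free_fraction < warn_below_fraction


# Check the volume that holds the current working directory.
free_bytes, free_fraction, is_low = free_space_report(".")
print(f"{free_bytes / 1e9:.1f} GB free ({free_fraction:.0%}); low on space: {is_low}")
```

On a central file server, a check like this could be pointed at the shared data folder and run on a schedule.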

Computer Workstations

Workstation speed is important when processing images or data. A fast processor and sufficient RAM can help your production staff create solutions efficiently. Hard drive capacity may also be a factor depending on the data and the type of processing you are attempting. Some data processes take gigabytes of RAM, while others require less. It is important to quantify how much you need.
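One simple way to start quantifying memory needs is to estimate the size of an uncompressed image held entirely in RAM. The sketch below is only a lower bound under that assumption; real software may tile the image or keep several working copies. The function name and the example image dimensions are hypothetical.

```python
def raster_bytes(width_px, height_px, bands, bytes_per_sample):
    """Bytes needed to hold one uncompressed raster fully in memory."""
    return width_px * height_px * bands * bytes_per_sample


# Hypothetical example: a 20,000 x 20,000 pixel, 3-band, 8-bit image.
size = raster_bytes(20_000, 20_000, bands=3, bytes_per_sample=1)
print(f"{size / 2**30:.2f} GiB for a single copy in memory")
```

If a process needs the source image and an output image in memory at once, the estimate roughly doubles.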

Before buying a computer, it is a good idea to think about how it will be used over the long term. If you’re editing 3D images, it’s important to have a graphics card and monitors that support it. If you need a computer that processes large amounts of data, it may be more important to have a faster processor and more RAM, and a less powerful graphics card will suffice. If the machine is only an internet and e-mail station, a less expensive option is acceptable.

Because technology is constantly improving, a great strategy is to have a computer replacement plan. Such a plan defines when a top-of-the-line machine purchased for a power user can be re-tasked for a less demanding user who might need it only for checking e-mail and surfing the internet.

Network Speed

Network speed is important for moving imagery and data across the network from workstation to workstation. Fiber cabling is currently the most expensive option, but offers the fastest connection. The next best option is the more common copper cable. Copper cabling is rated by category, from Category 1 through Category 6. The higher the category number, the higher the speeds the cable supports. Caution is warranted when choosing which cables and equipment to procure. If Category 6 cables run throughout your building, but some network component such as a switch or patch panel is rated below Category 6, your effective network speeds will be choked to less than Category 6 speeds. Nothing but Category 6 or fiber cables should be used in a new installation, although Category 5 cabling is by far the most common in use today.
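The bottleneck effect described above can be sketched with simple arithmetic: the slowest component sets the effective speed of the whole path. The component speeds and file size below are hypothetical, and the estimate ignores protocol overhead and other traffic, so real transfers will take longer.

```python
def effective_link_mbps(component_speeds_mbps):
    """The slowest component on the path limits the whole path."""
    return min(component_speeds_mbps)


def transfer_seconds(file_bytes, link_mbps):
    """Idealized transfer time: bits to move divided by bits per second."""
    return file_bytes * 8 / (link_mbps * 1_000_000)


# Hypothetical path: gigabit cabling, a 100 Mbps switch, a gigabit patch panel.
path_speeds_mbps = [1000, 100, 1000]
link = effective_link_mbps(path_speeds_mbps)
seconds = transfer_seconds(1_200_000_000, link)  # a 1.2 GB image file
print(f"Effective speed: {link} Mbps, about {seconds:.0f} s per 1.2 GB file")
```

Replacing the one slow component in this hypothetical path would cut the idealized transfer time by a factor of ten.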

Redundancy & Backup

Redundancy is a way to keep your network safe from accidents and disasters. Hard drives will break, servers will die, and thieves may steal or destroy. With a good redundancy and backup plan in place, data can be easily recovered after a disaster.

Hard drive redundancy should be implemented through a redundant array of independent disks (RAID). RAID is popular because most configurations use multiple hard drives to provide protection from any single hard drive failure, often with the added benefit of increased performance. Although there are many different kinds of RAID configurations, the more common implementations are:

  • RAID 0 is a “striped” set without parity. A striped set writes the data across multiple disks (minimum of 2 disks required), thus increasing read and write speeds. However, it provides no means of data recovery when a hard drive failure occurs.
  • RAID 1 is a “mirrored” set, meaning the RAID will create two identical disks. The redundancy can be extended by attaching each disk to a separate disk controller (“duplexing”). Although RAID 1 provides little performance boost, it does offer important data redundancy.
  • RAID 5 is probably the most common solution in a network environment. RAID 5 is a striped set with distributed parity and requires three or more drives. RAID 5 stripes data across the drives as RAID 0 does, but also distributes parity information among them. This RAID solution will provide redundancy for your data and increase hard drive performance. The main benefit is that when hot-swappable drive arrays are used, hard drives that have failed can be replaced “on-the-fly” and all data that was on the drive is automatically rebuilt from the data and parity information written to the other drives in the array. The accompanying pictures illustrate these RAID strategies using water coolers.
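To make the parity idea concrete, here is a minimal sketch of how a lost block can be rebuilt from the surviving blocks using XOR parity, along with the usable capacity of each RAID level listed above. It models a single stripe only; a real RAID 5 controller rotates parity across the drives and works at the block-device level. The function names and sample data are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_capacity_gb(level, disks, disk_gb):
    """Usable space for the RAID levels discussed above (equal-size disks)."""
    if level == 0:
        return disks * disk_gb        # all space usable, no redundancy
    if level == 1:
        return disk_gb                # every disk holds the same copy
    if level == 5:
        return (disks - 1) * disk_gb  # one disk's worth of space holds parity
    raise ValueError("only RAID 0, 1, and 5 are modeled in this sketch")


# One stripe: three data blocks plus the parity block computed from them.
data_blocks = [b"GEOS", b"PATI", b"AL!!"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second block, then rebuild it
# by XOR-ing the surviving data blocks with the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])

print(usable_capacity_gb(5, disks=4, disk_gb=500), "GB usable from four 500 GB disks")
```

The same XOR step works no matter which single block is lost, which is why RAID 5 can survive the failure of any one drive but not two.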

A second important consideration for your organization is your “backup plan.” This protects against catastrophic data loss. The backup plan should define what data should be regularly backed up, the frequency of the backups, and who is responsible for the backups. The most important component of this plan should be a rigid procedure that ensures an off-site copy of the archive is always kept. The off-site backup ensures that your data is safe even if a tornado, fire, hurricane, theft, or other disaster wipes out your main data drive.
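A backup plan of this kind can be written down as structured data so that gaps are easy to spot. The sketch below is one hypothetical way to record what is backed up, how often, and by whom, and to flag jobs that are overdue or lack an off-site copy. All names, datasets, and schedules are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupJob:
    dataset: str          # what is backed up
    frequency: timedelta  # how often it should run
    owner: str            # who is responsible
    offsite_copy: bool    # whether a copy is kept off-site

    def is_overdue(self, last_run, now):
        """True when more time than `frequency` has passed since `last_run`."""
        return now - last_run > self.frequency


# Hypothetical plan entries.
plan = [
    BackupJob("imagery archive", timedelta(days=1), "IS manager", True),
    BackupJob("project files", timedelta(days=7), "production lead", False),
]

# Every dataset should have an off-site copy; list the ones that do not.
missing_offsite = [job.dataset for job in plan if not job.offsite_copy]

now = datetime(2007, 12, 12, 8, 0)
last_run = datetime(2007, 12, 10, 8, 0)
overdue = plan[0].is_overdue(last_run, now)
print("Missing off-site copy:", missing_offsite)
print("Imagery archive backup overdue:", overdue)
```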

Part of the backup plan is the specification of fault-tolerant uninterruptible power supplies (UPS) on critical servers and computers. These will keep the equipment running when power is lost or interrupted and, more importantly, provide for an automated and orderly shutdown of all equipment if the power is not restored before the UPS batteries are depleted.

Finally, if your data center is housed in a room that requires cooling, it is very important to provide for redundant cooling. Servers and data arrays generate extraordinary heat. They will destroy themselves if they are not kept cool. When the data center’s main cooling equipment fails, there must be backup equipment to keep your servers and arrays from overheating.

Technology Is Always Changing

We have only touched on the basics of a digital geospatial environment, and with ever-changing products, data, and services, there will always be new technology to discuss. Here are a few suggestions to help in the future:

  • Research your needs and then confidently buy the best technology you can afford.
  • Keep upgradability in mind. What is current now may not be current in the near future.
  • Share your experiences with others and seek out new ideas.
  • Learn how to effectively use search engines to find solutions to problems you cannot solve on your own.

If you have any questions or comments, please feel free to contact Aerial Services, Inc. for assistance.

Tyler Coan was Information Systems Manager at Aerial Services, Inc. in 2007. Tyler graduated with a degree in Network Administration and Engineering from Hawkeye Community College in Waterloo, Iowa. Currently, his certifications include MCP, Network+, and A+.