A Guide to Getting the Best Performance with Large Datasets

Issue link: https://geospatial.trimble.com/en/resources/i/1415711


Trimble Software: Working With Large Datasets

Different Approaches to Storage & Processing

Different approaches are available to address the needs of each step of the workflows for working with large datasets. Some common options are described below, along with pros and cons for each method.

Local PC

Pros
● Most cost effective overall (not paying for data access and processing time).
● Low latency with a good graphics card provides the most responsive processing experience.
● Processing time and data transfer are usually fastest when using a local machine.
● Can use RAID arrays to improve read/write speed or provide security against data loss. See: https://www.pcworld.com/article/194360/raid-made-easy.html

Cons
● While PCs can be configured with large amounts of storage, most laptops have limited on-machine storage capability. This means they require the addition of external drives to scale overall capacity.
● Need a powerful (typically expensive) PC to process the data.
● Separate fast drives are needed for the temporary storage of data during a project.
● Archive drives need to be large and plentiful.
● Data sharing: local storage limits the ability of others to use or work on the data.

Local Network Server

Pros
● Very fast transfers to local machines are possible with the right data link (e.g., direct Ethernet or fiber connections to a small group of users' machines).
● Data sharing is easiest across a network, but it is not always the fastest method within a single office.
● Recent advances in gaming technology (e.g., NVIDIA GameStream) allow streaming across a network with low latency from one networked PC to another (remote desktop).

Cons
● Transfer of data to a local machine is usually needed to process data, because working on a local PC with data stored on a network is usually too slow, even with really fast data links.
● For processing of datasets using game streaming to be efficient, the data and the processing software both need to be on the networked streaming PC. Also, only one user at a time can use the PC remotely.
● The network is best suited as an archiving location.

Cloud

Pros
● Virtually unlimited storage at low prices.
● It would be possible to charge customers different rates to process their data, based on speed of delivery.
● Sharing is easily accomplished in the cloud.
● Scalable services may be available to quickly process large datasets, using multiple processors and large amounts of storage available in the cloud.

Cons
● Generally slow upload and download speeds.
● Primarily an archiving solution today; data would still typically need to be downloaded for processing.
● Processing in the cloud could be scalable to dramatically improve speed, but data access and processing time could be costly. A careful balance needs to be struck between time and cost.
● Visualization options for cloud-based data are limited and can have too much latency to be usable for heavy data manipulation (e.g., segmentation, registration checking, etc.).
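The transfer-speed trade-offs described above can be made concrete with a rough back-of-the-envelope estimate. The sketch below is only an illustration: the function name, the 500 GB dataset size, the three link speeds, and the efficiency factor are all hypothetical assumptions, not figures from this guide or measurements of any real link.

```python
def transfer_time_hours(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move size_gb gigabytes over a link.

    link_mbps is the nominal link speed in megabits per second.
    efficiency (0 < efficiency <= 1) is a hypothetical factor standing in
    for protocol overhead and link sharing; 1.0 means the ideal case.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabytes -> megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Assumed nominal link speeds in megabits per second (illustrative only).
LINKS_MBPS = {
    "internet upload to cloud (assumed 100 Mbps)": 100,
    "office gigabit Ethernet (assumed 1000 Mbps)": 1000,
    "direct 10 Gb link to server (assumed 10000 Mbps)": 10000,
}

dataset_gb = 500  # hypothetical project size

for name, mbps in LINKS_MBPS.items():
    hours = transfer_time_hours(dataset_gb, mbps)
    print(f"{name}: {hours:.2f} h")
```

Under these assumptions, the same dataset takes roughly a hundred times longer to move over the slowest link than over the fastest. Real transfers are also limited by drive read/write speed and network load, which the single efficiency factor only approximates.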
