Sparse Matrix Storage Formats: Implementation and Performance Analysis, Slides of Applications of Computer Sciences

These slides discuss the importance of sparse storage formats for large data sets consisting mainly of zeros. The project aims to extend the existing SparseLib++ library with more storage formats and efficient matrix-vector multiplication routines for both sequential and parallel processing. The slides also cover various sparse storage techniques, how they work, and their performance analysis.

Typology: Slides

2011/2012

Uploaded on 07/18/2012

padmavati
Introduction

  • Sparse storage formats are of great importance if we are dealing with large data mostly consisting of zeros
  • Sparse storage formats are techniques for storing and processing matrix data efficiently

Project Aim

  • The existing library for sparse matrices (SparseLib++) provides only three sparse storage techniques, together with routines for their matrix-vector multiplication. The aim of this project is to implement a sparse library that offers more storage formats and matrix-vector multiplication in both sequential and parallel form
  • For performance evaluation, I apply these storage techniques to different test matrices taken from Matrix Market

Cont…

  • Depending on how the data is organized, we first convert the network data from dense form to the storage format that suits it best
  • To resolve issues such as allocating network resources, we then have to apply different operations to that data
  • Converting to a sparse format makes these operations efficient and reduces the storage space needed.
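The dense-to-sparse conversion described above can be sketched as follows. This is a minimal illustration, not code from the project: the function name dense_to_csr and the small example matrix are assumptions, and compressed row storage (CSR) is used only as one possible target format.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.

    Returns (values, col_idx, row_ptr), where the nonzeros of row i
    occupy values[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # keep the nonzero value
                col_idx.append(j)  # and remember its column
        row_ptr.append(len(values))  # end of this row's nonzeros
    return values, col_idx, row_ptr


# Hypothetical 4x4 matrix with 5 nonzeros out of 16 entries.
dense = [
    [5, 0, 0, 0],
    [0, 8, 0, 0],
    [0, 0, 3, 0],
    [0, 6, 0, 1],
]
values, col_idx, row_ptr = dense_to_csr(dense)
```

Only the five nonzeros, their column indices, and one pointer per row (plus a final end marker) are kept; the eleven zeros are never stored.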

Cont…

  • If more users are to be added to the existing network, the matrix size grows; this is the problem of expanding matrices, which can be easily resolved in a sparse environment
  • If the network is yet to be initialized, it falls into the category of dynamic matrices.
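One way to see why a growing matrix is easy to handle in sparse form is the sketch below. It is an assumption-laden illustration, not the project's code: the class GrowableCOO and its method add_entry are hypothetical names, and coordinate (COO) storage is chosen only because it is the simplest format to extend.

```python
class GrowableCOO:
    """Coordinate (COO) storage whose logical dimensions can grow."""

    def __init__(self, n_rows=0, n_cols=0):
        self.n_rows = n_rows
        self.n_cols = n_cols
        self.rows, self.cols, self.vals = [], [], []

    def add_entry(self, i, j, v):
        # If (i, j) lies outside the current shape, enlarge the shape.
        # Existing triplets are left untouched.
        self.n_rows = max(self.n_rows, i + 1)
        self.n_cols = max(self.n_cols, j + 1)
        self.rows.append(i)
        self.cols.append(j)
        self.vals.append(v)


# Hypothetical network of 3 users; a fourth user (row 3) joins later.
net = GrowableCOO(3, 3)
net.add_entry(0, 1, 1.0)
net.add_entry(3, 0, 2.5)  # grows the matrix from 3x3 to 4x3
```

In a dense layout the same step would require allocating a larger array and copying every existing entry into it.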

Cont…

  • Diagonal storage format
  • Jagged diagonal storage format
  • Transpose jagged diagonal storage format
  • Java sparse array
  • Skyline (symmetric)
  • The block entry storage formats that I have implemented are:
    • Block Coordinate storage format
    • Block Compressed Row format
    • Block Compressed Column format

Cont…

  • Apart from these, the techniques that are included in the study are:

  • Primary storage format
  • Modified Compressed Row
  • Modified Compressed Column
  • Ellpack-itpack storage format
  • Bi-jagged diagonal storage format
  • Block Ellpack storage format
  • Block diagonal storage format

Analysis

  • There are two factors that influence the performance and storage size of sparse matrices. They are:

  • Number of nonzeros
    • The number of nonzeros has the same effect on all the storage formats: the more nonzeros a matrix contains, the more storage space and processing time it consumes.

Cont…

  • Storage techniques that have direct access to values, like COO, are affected more in storage size than in processing time, while techniques like CSR and CSC, which access values indirectly through pointers, are affected more in processing time than in storage size
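The storage side of this comparison can be made concrete by counting stored array elements. The sketch below uses the standard layouts (COO keeps a row index, a column index, and a value per nonzero; CSR and CSC keep one index and one value per nonzero plus a pointer array). The function names and the 1000 x 1000 example are assumptions for illustration, and the counts say nothing about processing time.

```python
def coo_storage(nnz):
    # One row index, one column index, and one value per nonzero.
    return 3 * nnz


def csr_storage(n_rows, nnz):
    # One column index and one value per nonzero,
    # plus n_rows + 1 row pointers.
    return 2 * nnz + n_rows + 1


def csc_storage(n_cols, nnz):
    # One row index and one value per nonzero,
    # plus n_cols + 1 column pointers.
    return 2 * nnz + n_cols + 1


# Hypothetical 1000 x 1000 matrix with 5000 nonzeros.
n, nnz = 1000, 5000
dense_elems = n * n
coo_elems = coo_storage(nnz)
csr_elems = csr_storage(n, nnz)
csc_elems = csc_storage(n, nnz)
```

Each extra nonzero costs COO three stored elements but costs CSR and CSC only two, which is consistent with COO being the format whose storage size reacts most strongly to the number of nonzeros.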

Type of matrices and the appropriate techniques

  • If the nonzeros are concentrated in a few dense rows, then the CSR technique is advised

Cont…

  • If the nonzeros are concentrated in a few dense columns, then CSC is advised

Cont…

  • If the nonzeros are randomly spread and there are no patterns to exploit, then we can use JDS or TJDS
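For readers unfamiliar with jagged diagonal storage (JDS), the following is a minimal sketch of how it can be built and used. It follows the usual textbook description (shift the nonzeros of each row to the left, order rows by decreasing nonzero count, then store the result column by column); the function and array names are assumptions and do not come from the project's library.

```python
def dense_to_jds(dense):
    """Build JDS arrays (jdiag, col_ind, jd_ptr, perm) from a dense matrix."""
    # Compress each row: keep (column, value) pairs of its nonzeros.
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    # perm[k] is the original index of the k-th longest row.
    perm = sorted(range(len(rows)), key=lambda i: len(rows[i]), reverse=True)
    max_len = len(rows[perm[0]]) if rows else 0
    jdiag, col_ind, jd_ptr = [], [], [0]
    for d in range(max_len):
        for i in perm:
            if len(rows[i]) <= d:
                break  # rows are sorted, so all later rows are shorter too
            j, v = rows[i][d]
            jdiag.append(v)
            col_ind.append(j)
        jd_ptr.append(len(jdiag))  # end of jagged diagonal d
    return jdiag, col_ind, jd_ptr, perm


def jds_matvec(jdiag, col_ind, jd_ptr, perm, x):
    """Compute y = A x for a matrix A held in JDS form."""
    y = [0] * len(perm)
    for d in range(len(jd_ptr) - 1):
        start, end = jd_ptr[d], jd_ptr[d + 1]
        for k in range(start, end):
            # Entry k of diagonal d belongs to the (k - start)-th longest row.
            y[perm[k - start]] += jdiag[k] * x[col_ind[k]]
    return y


# Hypothetical example matrix with rows of unequal length.
dense = [
    [1, 0, 2, 0],
    [0, 3, 0, 0],
    [4, 5, 6, 0],
    [0, 0, 0, 7],
]
jdiag, col_ind, jd_ptr, perm = dense_to_jds(dense)
y = jds_matvec(jdiag, col_ind, jd_ptr, perm, [1, 1, 1, 1])
```

The inner loop of jds_matvec runs over long contiguous arrays regardless of where the nonzeros sit in the matrix, which is why the format does not depend on any exploitable pattern.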

Cont…

  • If the matrix is symmetric, then we can store only one half of the matrix data
  • If we require high efficiency at the cost of storage size, we can use the skyline technique, which embeds zeros in the matrix data at their locations
  • If the matrix is too large to handle efficiently, then we have to use block entry storage formats instead of point entry
  • Java sparse array is a recently devised technique and has proved very efficient using an OO approach
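The first point, storing only one half of a symmetric matrix, can be illustrated with a short sketch. It keeps the lower triangle as coordinate triplets and reconstructs the effect of the upper triangle during multiplication. The function names and the example matrix are assumptions; the skyline format itself is not shown here.

```python
def sym_lower_coo(dense):
    """Keep only the lower triangle (including the diagonal) as triplets."""
    return [
        (i, j, v)
        for i, row in enumerate(dense)
        for j, v in enumerate(row[: i + 1])
        if v != 0
    ]


def sym_matvec(triplets, x):
    """Compute y = A x for symmetric A given only its lower triangle."""
    y = [0] * len(x)
    for i, j, v in triplets:
        y[i] += v * x[j]
        if i != j:
            # The mirror entry a[j][i] == a[i][j] is not stored explicitly.
            y[j] += v * x[i]
    return y


# Hypothetical symmetric 3x3 matrix with 7 nonzeros.
dense = [
    [2, 1, 0],
    [1, 3, 4],
    [0, 4, 5],
]
triplets = sym_lower_coo(dense)
y = sym_matvec(triplets, [1, 2, 3])
```

Here five triplets are stored instead of seven, and each off-diagonal triplet is used twice in the product.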

Cont…

  • CSR-vector multiplication
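A minimal sketch of CSR matrix-vector multiplication is given here for reference. The array names values, col_idx, and row_ptr are assumptions following common usage, and the example arrays encode a hypothetical 4x4 matrix; this is not the routine from the project's library.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A held in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0] * n_rows
    for i in range(n_rows):
        # The nonzeros of row i sit in positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


# CSR arrays for the hypothetical matrix
#   [5 0 0 0]
#   [0 8 0 0]
#   [0 0 3 0]
#   [0 6 0 1]
values = [5, 8, 3, 6, 1]
col_idx = [0, 1, 2, 1, 3]
row_ptr = [0, 1, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1, 2, 3, 4])
```

Each output element y[i] is completed by one pass over a contiguous slice of values, while x is read indirectly through col_idx.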

Cont…

  • CSC-vector multiplication
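A matching sketch for CSC matrix-vector multiplication follows. As before, the array names values, row_idx, and col_ptr and the example matrix are assumptions for illustration rather than the project's own routine.

```python
def csc_matvec(values, row_idx, col_ptr, x, n_rows):
    """Compute y = A x for a matrix A held in CSC form."""
    y = [0] * n_rows
    for j in range(len(col_ptr) - 1):
        # The nonzeros of column j sit in positions col_ptr[j] .. col_ptr[j+1]-1.
        for k in range(col_ptr[j], col_ptr[j + 1]):
            # Column j contributes x[j] times its entries, scattered into y.
            y[row_idx[k]] += values[k] * x[j]
    return y


# CSC arrays for the hypothetical matrix
#   [5 0 0 0]
#   [0 8 0 0]
#   [0 0 3 0]
#   [0 6 0 1]
values = [5, 8, 6, 3, 1]
row_idx = [0, 1, 3, 2, 3]
col_ptr = [0, 1, 3, 4, 5]
y = csc_matvec(values, row_idx, col_ptr, [1, 2, 3, 4], n_rows=4)
```

In contrast to the row-oriented case, x is read sequentially and the writes into y are the indirect accesses, so the number of rows has to be passed in separately.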