File Formats Assignment

Uploaded by moregaurav
File Format:

A file format is a standard way that information is encoded for storage in
a computer file. It specifies how bits are used to encode information in a digital
storage medium. File formats may be either proprietary or free and may be either
unpublished or open.
In simple terms, a file format describes the way information is organized in a
computer file. File formats apply to documents, images, audio files, video files and
research data sets, e.g. .doc or .pdf.
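Because a format fixes how the bits are laid out, many formats can be recognized from their first few bytes regardless of the file extension. The following is a minimal sketch, assuming the well-known leading signatures of a few formats covered later in this document; the table `SIGNATURES` and the function `guess_format` are illustrative names, and the list is far from complete.

```python
# Leading byte signatures ("magic numbers") for a few formats named in this document.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container (e.g. .docx, .pptx, .oxps)"),
]


def guess_format(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


print(guess_format(b"%PDF-1.7\n"))         # PDF
print(guess_format(b"GIF89a\x01\x00"))     # GIF
print(guess_format(b"plain text content"))  # unknown
```

In practice only the first few bytes of a file need to be read before calling such a function.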
Image File Formats:
TIFF is, in principle, a very flexible format that can be lossless or lossy. The details
of the image storage algorithm are included as part of the file. In practice, TIFF is
used almost exclusively as a lossless image storage format that uses no
compression at all. Most graphics programs that use TIFF do not apply compression.
Consequently, file sizes are quite big. (Sometimes a lossless compression
algorithm called LZW is used, but it is not universally supported.)
PNG is also a lossless storage format. However, in contrast with common TIFF
usage, it looks for patterns in the image that it can use to compress file size. The
compression is exactly reversible, so the image is recovered exactly.
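The idea of exactly reversible compression can be tried with Python's `zlib` module, which implements the DEFLATE method that PNG uses for its compression stage. This is only a sketch: the `pixels` buffer is a made-up stand-in for image data, and a real PNG encoder also filters each row and wraps the result in chunks.

```python
import zlib

# Hypothetical image data: 10,000 one-byte pixels, mostly a uniform background.
pixels = bytes([255]) * 9000 + bytes(range(200)) * 5

compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)

print(len(pixels), "bytes before,", len(compressed), "bytes after")
print("exact round trip:", restored == pixels)  # True
```

The repeated patterns are what make the compressed copy smaller; the decompressed bytes are identical to the input.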
GIF creates a table of up to 256 colors from a pool of 16 million. If the image has
fewer than 256 colors, GIF can render the image exactly. When the image
contains many colors, software that creates the GIF uses any of several algorithms
to approximate the colors in the image with the limited palette of 256 colors
available. Better algorithms search the image to find an optimum set of 256
colors. Sometimes GIF uses the nearest color to represent each pixel, and
sometimes it uses "error diffusion" to adjust the color of nearby pixels to correct
for the error in each pixel.
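The "nearest color" approach can be sketched as follows. The 4-entry `palette` and the function name `nearest_color` are assumptions made for illustration (a GIF palette may hold up to 256 entries), and squared RGB distance is just one simple way to measure closeness.

```python
def nearest_color(pixel, palette):
    """Return the palette entry with the smallest squared RGB distance to pixel."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))


# A hypothetical 4-color palette: black, white, red, blue.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]

print(nearest_color((250, 10, 20), palette))    # (255, 0, 0)
print(nearest_color((200, 200, 210), palette))  # (255, 255, 255)
```

Error diffusion would go one step further and spread the difference between each original pixel and its chosen palette color onto neighboring pixels.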
GIF achieves compression in two ways. First, it reduces the number of colors of
color-rich images, thereby reducing the number of bits needed per pixel, as just
described. Second, it replaces commonly occurring patterns (especially large
areas of uniform color) with a short abbreviation: instead of storing "white, white,
white, white, white," it stores "5 white." Thus, GIF is "lossless" only for images
with 256 colors or fewer. For a rich, true color image, GIF may "lose" 99.998% of the
colors (256 out of roughly 16 million colors is about 0.002%).
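The "5 white" abbreviation described above corresponds to run-length encoding, which can be sketched in a few lines. Note the assumption: this illustrates the idea only. GIF files themselves use the LZW algorithm, which builds a dictionary of repeated patterns rather than storing simple run counts. The function names are illustrative.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse consecutive repeats into (count, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]


row = ["white"] * 5 + ["black"] * 2 + ["white"] * 3
encoded = rle_encode(row)
print(encoded)                     # [(5, 'white'), (2, 'black'), (3, 'white')]
print(rle_decode(encoded) == row)  # True
```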
JPG is optimized for photographs and similar continuous tone images that contain
many, many colors. It can achieve astounding compression ratios even while
maintaining very high image quality. GIF compression is unkind to such images.
JPG works by analyzing images and discarding kinds of information that the eye is
least likely to notice. It stores information as 24 bit color. Important: the degree
of compression of JPG is adjustable. At moderate compression levels of
photographic images, it is very difficult for the eye to discern any difference from
the original, even at extreme magnification. Compression factors of more than 20
are often quite acceptable. Better graphics programs, such as Paint Shop Pro and
Photoshop, allow you to view the image quality and file size as a function of
compression level, so that you can conveniently choose the balance between
quality and file size.
RAW is an image output option available on some digital cameras. Though
lossless, it is a factor of three or four smaller than TIFF files of the same image.
The disadvantage is that there is a different RAW format for each manufacturer,
and so you may have to use the manufacturer's software to view the images.
(Some graphics applications can read some manufacturers' RAW formats.)
BMP is an uncompressed proprietary format invented by Microsoft. There is
really no reason to ever use this format.
PSD and PSP are proprietary formats used by graphics programs. Photoshop's files
have the PSD extension, while Paint Shop Pro files use PSP. These are the
preferred working formats as you edit images in the software, because only the
proprietary formats retain all the editing power of the programs. These packages
use layers, for example, to build complex images, and layer information may be
lost in the nonproprietary formats such as TIFF and JPG. However, be sure to save
your end result as a standard TIFF or JPG, or you may not be able to view it in a
few years when your software has changed.
Audio File Formats:
AIF Format- AIF is an audio format that was developed by Apple Computer. Most
recent browsers, including Microsoft IE and Netscape Navigator, will play an aif
file using the browser's built-in sound player. AIF files have one of these
extensions: .aif, .aiff, .aifc
AU Format- AU is one of the most common audio formats used on the Web. It
was created by Sun Microsystems and is sometimes referred to as "audio/basic"
format. Most browsers support the au format with their internal sound players.
An au-formatted file has this extension: .au
EA Format- EA format was created by Geo and named after its Emblaze Audio
creation products. It compresses audio files to a fraction of their original size and
uses a Java applet to play back the file. This is the first audio format that was
created specifically for Web-based audio. Any browser that supports Java will play
an .ea file without requiring additional plug-ins. EA files have this extension: .ea
MIDI Format- MIDI stands for Musical Instrument Digital Interface. MIDI is
typically used to play music. MIDI files contain a set of instructions that your
computer sends to a sound card, synthesizer, or other device. MIDI files contain
information about how to play the sound, rather than recording an actual
rendition of the sound. They contain information about musical notes and which
instruments play those notes. The quality of MIDI sound on the Web depends on
the quality of the MIDI interpreter in the computer's sound card. For example,
MIDI files sound very nice on WebTV because the WebTV MIDI interpreter does a
high-quality job translating and "displaying" the sound. MIDI files have one of
these extensions: .midi, .mid
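To make the "instructions rather than recordings" point concrete, here is a small sketch of a single MIDI "note on" message: a status byte (0x90 plus the channel number) followed by a note number and a velocity. The helper names are illustrative, and the frequency formula assumes standard tuning with note 69 (the A above middle C) at 440 Hz.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte MIDI 'note on' message for channels 0-15."""
    return bytes([0x90 | channel, note, velocity])


def note_frequency(note: int) -> float:
    """Pitch in Hz for a MIDI note number, assuming A4 (note 69) = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


message = note_on(channel=0, note=60, velocity=100)  # middle C
print(message.hex(" "))              # 90 3c 64
print(round(note_frequency(60), 2))  # 261.63
```

Three bytes describe the start of a note of any length, which is why MIDI files are tiny compared with recorded audio; how it sounds depends entirely on the device that interprets the message.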
MP3 Format- The MPEG Layer-3 format is the most popular format for
downloading and storing music. By eliminating portions of the audio file that are
essentially inaudible, mp3 files are compressed to roughly one-tenth the size of
an equivalent PCM file while maintaining good audio quality. It is recommended
for music storage but is less well suited to voice storage. MP3 files have the
extension: .mp3
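The "one-tenth" figure can be turned into a quick size estimate. The sketch below assumes CD-quality PCM (44,100 samples per second, 16 bits per sample, two channels), which is an assumption not stated above, and it treats the ratio as exactly 1:10 even though real MP3 sizes depend on the chosen bit rate.

```python
SAMPLE_RATE = 44_100   # samples per second (assumed CD quality)
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 2           # stereo


def pcm_bytes(seconds: int) -> int:
    """Size of uncompressed PCM audio of the given duration, in bytes."""
    return seconds * SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS


three_minutes = pcm_bytes(180)
print(three_minutes)        # 31752000 bytes of PCM
print(three_minutes // 10)  # 3175200 bytes at a 10:1 ratio
```

So a three-minute song of roughly 32 MB as PCM comes out at roughly 3 MB as MP3 under these assumptions.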

WAV Format- WAV is Microsoft's audio format of choice. Since Windows 3.1,
WAV has been the native format for sound within the Windows environment.
Needless to say, this makes it one of the most common sound formats on the
Web. Most browsers support the .wav format with their internal sound players. A
.wav-formatted file has this extension: .wav
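A WAV file can be written and read back with Python's standard `wave` module. The sketch below works in memory and uses made-up parameters (mono, 16-bit, 8,000 Hz, four samples); it also shows that the data begins with the `RIFF` and `WAVE` markers of the container that WAV is built on.

```python
import io
import struct
import wave

buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 2 bytes = 16-bit samples
    wav.setframerate(8000)   # samples per second
    wav.writeframes(struct.pack("<4h", 0, 1000, 0, -1000))

data = buffer.getvalue()
print(data[:4], data[8:12])  # b'RIFF' b'WAVE'

buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    params = wav.getparams()
print(params.nchannels, params.sampwidth, params.framerate, params.nframes)  # 1 2 8000 4
```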
Video File Formats:
ASF, WMV (Advanced Streaming Format, Windows Media Video)
Can be streamed across the Internet and viewed before the entire file has been
downloaded when using a Windows Media server.
Requires Windows Media Player to be installed.
Typically placed on an Internet streaming server.
AVI (Audio Video Interleave)
Can be viewed with standard Windows players such as Windows Media Player.
Uncompressed AVI yields high-quality video but uses a lot of storage space.
If downloading from the Internet, the entire file must be downloaded before
being played.
Typically stored on a local hard disk or CD-ROM, or made available as a download
from a web server.
MOV (Apple QuickTime Movie)
Requires Apple's QuickTime player.
Depending on the compression chosen, can provide a very high quality video clip,
but better quality uses more storage space.
Can be streamed across the Internet and viewed before the entire file has been
downloaded if using a QuickTime streaming server.
Can be placed either on an Internet streaming server or on local storage such as a
hard disk or CD-ROM.
MPEG (Moving Picture Experts Group)
Can provide VHS-quality movies or better.
MPEG-1 is equal to VHS. MPEG-2 is better than VHS and is used for DVD.
MPEG-4 is the best quality.
Requires an MPEG player to view.
If downloading from the Internet, the entire file must be downloaded before
being played because file sizes are very large.
Typically MPEG-2 is used to make DVD movies.
Can be placed on any storage media large enough to hold the file, but at the current
time Internet speeds will not support streaming MPEG files. Any MPEG files
found on the Web will have to be downloaded to the local drive and played using
an MPEG player.
RM (Real Media)
Can be streamed across the Internet and viewed before the entire file has been
downloaded when using a Real Networks streaming server.
Has very high compression, but at a cost to quality.
Requires RealPlayer to view content.
Additional Audio File Formats:
gsm - designed for telephony use in Europe, gsm is a very practical format for
telephone-quality voice. It makes a good compromise between file size and
quality. We recommend this format for voice. Note that wav files can also be
encoded with the gsm codec.


dct - A variable codec format designed for dictation. It has dictation header
information and can be encrypted (often required by medical confidentiality
laws). The standard dct player is the Express Scribe Transcription Player.



flac - a lossless compression codec. You can think of lossless compression as
like zip but for audio. If you compress a PCM file to flac and then restore it
again, it will be a perfect copy of the original. (All the other codecs discussed
here are lossy, which means a small part of the quality is lost.) The cost of this
losslessness is that the compression ratio is not as good. But we recommend flac
for archiving PCM files where quality is important (e.g. broadcast or music
use).


au - the standard audio file format used by Sun, Unix and Java. The audio in au
files can be PCM or compressed with the ulaw, alaw or G729 codecs.

aif - the standard audio file format used by Apple. It is like a wav file for the
Mac.


vox - the vox format most commonly uses the Dialogic ADPCM (Adaptive
Differential Pulse Code Modulation) codec. Similar to other ADPCM formats, it
compresses to 4 bits per sample. Vox format files are similar to wave files except
that vox files contain no information about the file itself, so the codec, sample rate
and number of channels must first be specified in order to play a vox file. Vox is a
very old file type and is pretty poor. We do not recommend it for anything
except for supporting legacy systems.

raw - a raw file can contain audio in any codec but is usually used with PCM
audio data. It is rarely used except for technical tests.


Document File Formats:
ASCII
The American Standard Code for Information Interchange is a character-encoding
scheme originally based on the English alphabet that encodes 128
specified characters - the numbers 0-9, the letters a-z and A-Z, some
basic punctuation symbols, some control codes that originated with Teletype
machines, and a blank space - into 7-bit binary integers.
ASCII codes represent text in computers, communications equipment, and other
devices that use text. Most modern character-encoding schemes are based on
ASCII, though they support many additional characters.
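The mapping from characters to 7-bit integers is easy to inspect in Python, whose `ord` function returns a character's code point (identical to the ASCII code for these 128 characters). The helper name `ascii_bits` is illustrative.

```python
def ascii_bits(ch: str) -> str:
    """Return the 7-bit binary form of an ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is outside the ASCII range")
    return format(code, "07b")


for ch in ("A", "a", "0", " "):
    print(repr(ch), ord(ch), ascii_bits(ch))
# 'A' 65 1000001
# 'a' 97 1100001
# '0' 48 0110000
# ' ' 32 0100000
```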
AmigaGuide
AmigaGuide is a hypertext document file format designed for the Amiga. Files are
stored in ASCII, so it is possible to read and edit a file without the need for special
software.
Starting with Workbench 2.1, AmigaOS included, as a standard feature, an AmigaGuide
system for inline operating-system help files and for reading manuals with simple
hypertext formatting, based on a viewer called simply "AmigaGuide". Users with
earlier versions of Workbench could view the files by obtaining the AmigaGuide 34
program and library, which were distributed with public domain floppy disk
collections (for example the Fred Fish collection) or could be downloaded directly
from the Aminet repository on the web. Starting from AmigaOS 3.0, the AmigaGuide
tool was replaced with the more complete and flexible MultiView.
Doc (.doc)
In computing, DOC or doc (an abbreviation of 'document') is a filename
extension for word processing documents, most commonly in the Microsoft
Word Binary File Format. Historically, the extension was used for documentation
in plain text, particularly of programs or computer hardware on a wide range
of operating systems. During the 1980s, WordPerfect used DOC as the extension
of their proprietary format. Later, in the 1990s, Microsoft chose to use the DOC
extension for their proprietary Microsoft Word format. The original uses for the
extension have largely disappeared from the PC world.

DocBook (.dbk)
DocBook is a semantic markup language for technical documentation. It was
originally intended for writing technical documents related to computer hardware
and software but it can be used for any other sort of documentation.
As a semantic language, DocBook enables its users to create document content in
a presentation-neutral form that captures the logical structure of the content;
that content can then be published in a variety of formats,
including HTML, XHTML, EPUB, PDF, man pages, Web help and HTML Help,
without requiring users to make any changes to the source.
HTML (.html)
Hypertext Markup Language (HTML) is the main markup language for
creating web pages and other information that can be displayed in a web
browser.
The purpose of a web browser is to read HTML documents and compose them
into visible or audible web pages. The browser does not display the HTML tags,
but uses the tags to interpret the content of the page.
HTML elements form the building blocks of all websites. HTML allows images and
objects to be embedded and can be used to create interactive forms. It provides a
means to create structured documents by denoting structural semantics for text
such as headings, paragraphs, lists, links, quotes and other items. It can
embed scripts written in languages such as JavaScript which affect the behavior of
HTML web pages.
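How a program separates tags from displayed content can be sketched with Python's standard `html.parser` module. The class name `TextExtractor` and the sample `page` string are made up for illustration; a real browser does far more, such as layout and styling.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect tag names and visible text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


page = "<html><body><h1>File Formats</h1><p>HTML is a <b>markup</b> language.</p></body></html>"
parser = TextExtractor()
parser.feed(page)
print(parser.tags)  # ['html', 'body', 'h1', 'p', 'b']
print(parser.text)  # ['File Formats', 'HTML is a', 'markup', 'language.']
```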

Open XML Paper Specification (.oxps)
Open XML Paper Specification (also referred to as OpenXPS) is an
open specification for a page description language and a fixed-document
format. Microsoft developed it as the XML Paper Specification (XPS).
It is an XML-based (more precisely XAML-based) specification, based on a new
print path (print processing data representation and data flow) and a color-
managed vector-based document format that supports device
independence and resolution independence.

An XPS file is, in fact, a Unicoded ZIP archive using the Open Packaging
Conventions, containing the files which make up the document. These include an
XML markup file for each page, text, embedded fonts, raster images, 2D vector
graphics, as well as the digital rights management information. The contents of an
XPS file can be examined simply by opening it in an application which supports ZIP
files.
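Because such a file is an ordinary ZIP archive, its parts can be listed with Python's standard `zipfile` module. The sketch below builds a tiny stand-in archive in memory so that it is self-contained; the member names are placeholders and the result is not a valid XPS document. For a real file, the bytes would be read from disk instead. The function name `list_parts` is illustrative.

```python
import io
import zipfile


def list_parts(package: bytes) -> list[str]:
    """Return the member names stored in a ZIP-based document package."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return archive.namelist()


# Build a small stand-in package in memory (placeholder names and content).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("Documents/1/Pages/1.fpage", "<FixedPage/>")

print(list_parts(buffer.getvalue()))
# ['[Content_Types].xml', 'Documents/1/Pages/1.fpage']
```

The same approach applies to other ZIP-based formats such as .docx and .pptx.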
PDF (.PDF)
Portable Document Format (PDF) is a file format used to represent documents in
a manner independent of application software, hardware, and operating
systems. Each PDF file encapsulates a complete description of a fixed-layout flat
document, including the text, fonts, graphics, and other information needed to
display it. In 1991, Adobe Systems co-founder John Warnock outlined a system
called "Camelot" that evolved into PDF.
Text in PDF is represented by text elements in page content streams. A text
element specifies that characters should be drawn at certain positions. The
characters are specified using the encoding of a selected font resource.
XLS (.xls)
XLS is a file extension for a spreadsheet file format created by Microsoft for use
with Microsoft Excel. XLS stands for eXceL Spreadsheet. Microsoft Excel files use a
proprietary format for storing Microsoft Excel documents. This file format is known
as the Binary Interchange File Format (BIFF). XLS files can also be opened by the
Microsoft Excel Viewer and OpenOffice.

Office Open XML (.docx)
Office Open XML (also informally known as OOXML or OpenXML) is a zipped,
XML-based file format developed by Microsoft for representing spreadsheets,
charts, presentations and word processing documents.



PowerPoint File Formats:
.ppt
A slide show that you can open in PowerPoint 97 to Office PowerPoint 2003.
PowerPoint Presentation (.pptx)
The PPTX file extension is given to Microsoft PowerPoint files that are created in
PowerPoint versions 2007 and later. Microsoft PowerPoint is presentation
software that allows users to create slide shows containing pictures, text, music
and video. The PowerPoint program is included in the Microsoft Office Suite.

Earlier versions of Microsoft PowerPoint create files using the .ppt extension.
Later versions of PowerPoint assign the .pptx extension because
these versions of the software use the Open XML format. This format allows
PowerPoint files to be compressed into smaller sizes, making it easier to
distribute the files online and taking up less drive space on a user's computer.
.pptm
Files with the .pptm extension are most commonly associated with the Microsoft
PowerPoint presentation software. The PPTM files feature macro-enabled
presentations that have been created by the software. The PPTM files contain a
collection of presentation slides containing images, text, movies, sound effects
and embedded macros. These files are similar to PPTX files, but PPTX files do not
contain embedded macros. The PPTM file format is based on the Open XML
document format introduced by Microsoft in 2007.
.potx
Files that contain the .potx file extension are used by the PowerPoint
presentation software application. PowerPoint is a program that allows users to
create dynamic presentations, and the POTX files that it uses
contain PowerPoint templates. These templates allow a user
to save default file settings in the PowerPoint application, allowing the user to
re-apply the same layout settings across multiple PowerPoint presentation files
without having to create each file from scratch. Companies often use the POTX
file format to create standard master slides which contain the company's header,
footer and logo. The POTX file format is based on the Open XML format, which
was only included in versions 2007 and later of the PowerPoint product.
.potm
A file with the POTM file extension is a Microsoft PowerPoint Macro-Enabled
Design Template file. It is a template that includes pre-approved macros, which
can then be used in presentations based on that template.
OpenDocument Presentation (.odp)
Used to save PowerPoint 2010 files so they can be opened in presentation
applications that use the Open Document Presentation format, such as Google
Docs and OpenOffice.org Impress. You can also open presentations in the .odp
format in PowerPoint 2010. Some information might be lost when saving and
opening .odp files.
Outline/RTF (.rtf)
A presentation outline as a text-only document provides smaller file sizes and the
ability to share macro-free files with others who may not have the same version
of PowerPoint or the operating system that you have. Any text in the notes pane
is not saved with this file format.
Windows Media Video (.wmv)
A presentation that is saved as a video. PowerPoint 2010 presentations can be
saved at High Quality (1024 x 768, 30 frames per second); Medium Quality (640 x
480, 24 frames per second); and Low Quality (320 x 240, 15 frames per second).
The WMV file format plays on many media players, such as Windows Media
Player.
Email File Formats :

eml
Used by many email clients including Microsoft Outlook Express, Lotus
Notes, Windows Mail, Mozilla Thunderbird, and Postbox. The files are plain
text in MIME format, containing the email header as well as the message contents
and attachments in one or more of several formats.
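Since an .eml file is plain text in MIME format, it can be parsed with Python's standard `email` package. The message below is a made-up example (the addresses and subject are hypothetical); a real file would be read from disk and passed to the same function.

```python
from email import message_from_string

# A minimal, hypothetical message in the same plain-text layout as an .eml file.
raw = (
    "From: alice@example.com\r\n"
    "To: bob@example.com\r\n"
    "Subject: Quarterly report\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=us-ascii\r\n"
    "\r\n"
    "The report is attached.\r\n"
)

msg = message_from_string(raw)
print(msg["Subject"])             # Quarterly report
print(msg.get_content_type())     # text/plain
print(msg.get_payload().strip())  # The report is attached.
```

The blank line separates the header from the message contents, exactly as in the file on disk.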
emlx
Used by Apple Mail.
A file with the EMLX file extension is an Apple Mail Email file. Opening
the EMLX file in its native program is preferable to converting it, because it is
both easier and more likely to preserve the message accurately. Of course,
if you don't have a program that opens EMLX files, a third-party file conversion
tool could be very useful.
msg
Used by Microsoft Office Outlook and Office Logic Groupware.
MSG is a file extension for a mail message file format used by Microsoft Outlook
and Exchange. An MSG file can contain plain ASCII text for the headers and the
main message body as well as hyperlinks and attachments.
MSG files may be exported for the purposes of archiving and storage or scanning
for malware.
mbx
MBX is an email file format created by Microsoft Outlook Express.
Outlook Express stores message data in MBX files on the user's computer. MBX files
are usually held in a folder that corresponds to a folder within Outlook Express,
such as Inbox or Sent. The folders also have the .MBX extension.


Text File Formats:

ASCII
The ASCII standard allows ASCII-only text files (unlike most other file types) to be
freely interchanged and readable on Unix, Macintosh, Microsoft Windows, DOS,
and other systems. These differ in their preferred line ending convention and
their interpretation of values outside the ASCII range (their character encoding).
UTF-8
In an English-language context, text files can be purely ASCII, whereas in an
international context text files usually allow 8-bit values so that native-language
text can be stored. In those international contexts, a Byte Order Mark (BOM) can
appear at the start of the file to differentiate UTF-8 encoding from a legacy
regional encoding.
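A UTF-8 byte order mark is the three bytes EF BB BF at the start of the data. The following sketch checks for it using the constant `codecs.BOM_UTF8` and Python's `utf-8-sig` codec, which writes the mark when encoding and skips it when decoding; the function name `has_utf8_bom` is illustrative.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


with_bom = "café".encode("utf-8-sig")  # BOM followed by the UTF-8 text
without_bom = "café".encode("utf-8")

print(with_bom[:3].hex(" "))         # ef bb bf
print(has_utf8_bom(with_bom))        # True
print(has_utf8_bom(without_bom))     # False
print(with_bom.decode("utf-8-sig"))  # café
```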
.TXT
.txt is a file format for files consisting of text usually containing very little
formatting (e.g., no bolding or italics). The precise definition of the .txt format is
not specified, but typically matches the format accepted by the system terminal
or simple text editor. Files with the .txt extension can easily be read or opened by
any program that reads text and, for that reason, are considered universal
(or platform independent).
Standard Windows .txt files
MS-DOS and Windows use a common text file format, with each line of text
separated by a two-character combination: CR and LF, which have ASCII codes 13
and 10. It is common for the last line of text not to be terminated with a CR-LF
marker, and many text editors (including Notepad) do not automatically insert
one on the last line.
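The CR and LF codes can be seen directly by looking at the bytes of a two-line string. This is a small sketch with a made-up sample text whose last line, as described above, has no terminating CR-LF.

```python
text = "first line\r\nsecond line"  # no CR-LF after the last line
raw = text.encode("ascii")

print(raw[10], raw[11])    # 13 10  (CR, LF)
print(text.split("\r\n"))  # ['first line', 'second line']
```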
Most Windows text files use a form of ANSI, OEM or Unicode encoding. What
Windows terminology calls "ANSI encodings" are usually single-byte ISO-8859
encodings, except for in locales such as Chinese, Japanese and Korean that
require double-byte character sets. ANSI encodings were traditionally used as
default system locales within Windows, before the transition to Unicode. By
contrast, OEM encodings, also known as MS-DOS code pages, were defined by
IBM for use in the original IBM PC text mode display system. They typically
include graphical and line-drawing characters common in (possibly full-screen)
MS-DOS applications. Newer Windows text files may use a Unicode encoding such
as UTF-16LE or UTF-8, with a Byte Order Mark.
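The practical effect of these different encodings is that the same character is stored as different bytes. The sketch below encodes the letter "é" four ways; the choice of `cp1252` as an example "ANSI" code page and `cp437` as an example OEM code page is an assumption that fits Western European and US systems, and other locales use other code pages.

```python
ENCODINGS = ["cp1252", "cp437", "utf-8", "utf-16-le"]

# Map each encoding name to the bytes it produces for the same character.
encoded = {name: "é".encode(name) for name in ENCODINGS}

for name, data in encoded.items():
    print(f"{name:10} {data.hex(' ')}")
# cp1252     e9
# cp437      82
# utf-8      c3 a9
# utf-16-le  e9 00
```

Reading a file with the wrong one of these encodings is what produces garbled accented characters.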
Accounting File Formats:
ASC File
The ASC file is the most complex and the most important file that the
Simply Accounting application uses. It can be internally divided into eight parts:
SN_REC Header (application defaults)
General Ledger Accounts (GYN records)
Payable Ledger Accounts (VYD and VYN records)
Receivable Ledger Accounts (VYD and VYN records)
Payroll Ledger Accounts (MYN or USMYN records)
Inventory Ledger Accounts (INV records)
Job Cost Ledger Accounts (JYN records)
Journal Entry Indices (JIDXREC records)
ASJ File
The ASJ file contains journal entries. The journal is a history of changes made to
the general ledger accounts. Every journal record has a JYD header that indicates
its date, type, and source. The JYD header is followed by a series of ENTJ records
that store changes to individual accounts. Invoices are transactions that modify
accounts. Therefore, each VYD record in the ASC file will have a corresponding
journal record in the ASJ file.
IT0 File
The IT0 file contains a series of IT_INV_EXT records that individually complement
the inventory ledger accounts in the ASC file. A new IT_INV_EXT record is created
for every tracked part number. If any record in the IT0 file disagrees with an INV
record in the ASC file, then the Simply Accounting application will discard the
entire IT0 file.



IT2 File
Invoices are broken into two parts that are stored in the IT2 file and the IT3 file.
The IT2 file stores inventory tracking records. The IT3 file stores inventory lookup
records. The IT2 file design is almost identical to the IT3 file design. Each invoice is
represented by one record in the IT2 file together with one record in the IT3 file.

IT3 File
The IT3 file stores inventory lookup records. Each inventory lookup record is an
IT_LOOKUP header followed by a series of IT_LOOKUP_LINE records, which
represent invoice line items, and IT_DIST records, which represent distributed line
items. Distributed line items are special instances of line items that contribute to
more than one general ledger account. Note that distributed line items are not
reflected in the IT2 file.

IT4 File
The IT4 file is an array of type IT_INDEXREC that points into the IT2 file. Every IT4
record points to one corresponding IT2 record. If the IT2 file is ever deleted by the
Simply Accounting application, then the IT4 file is also deleted. The Simply
Accounting application might use the IT4 file to begin the generation of invoice
reports because IT4 records contain the invoice source, the invoice total, the
journal entry number, and the journal posting date of a transaction.



