Category: Data Visualisation

  • Convert .fa FASTA file to Excel / Text Tab Delimited format

    Recently a friend asked me to help convert a text document into an Excel sheet that could be understood easily. At first I could not open the file directly in Notepad or Excel, as the format was not properly supported.

    Later, when I forced it open in Notepad, it was just a mass of text strings with nothing I could interpret. I wanted to know what the file format was. It turns out that the .fa extension denotes a FASTA file, which Wikipedia describes as follows:

    FASTA is a DNA and protein sequence alignment software package first described (as FASTP) by David J. Lipman and William R. Pearson in 1985. Its legacy is the FASTA format which is now ubiquitous in bioinformatics. (Source: Wikipedia)

    I had never encountered this file format before and was excited to learn something about DNA sequencing.

    Google is the best tool for finding directions to a solution, so I searched Google for keywords about converting a .fa file to Excel. However, I realised there was no ready-made tool or technique for converting such a file. Yes, you could manually edit each of the 1500 lines and spend 2-3 hours getting it done, without being sure of the accuracy.

    While googling, I stumbled upon a YouTube video showing the same conversion being done in some "Orange" software. Initially I was not sure what software it was, but after switching the video to HD mode I realised it was Orange Canvas, an open source data visualisation program, which you can download at http://orange.biolab.si/

    They had created a special widget that can be added to Orange Canvas. It converts the data from a FASTA file into a usable, easily understood tab-delimited form.
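    For comparison, the same kind of conversion can be sketched in a few lines of plain Python, without Orange. This is only a minimal sketch and not how the Orange widget works internally. It assumes the usual FASTA layout, where each record starts with a ">" header line followed by one or more sequence lines. The function names (read_fasta, fasta_to_tsv) and the column names are my own choices for illustration.

```python
import csv
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record begins: emit the previous one first, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence data may be wrapped over several lines; join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_tsv(fasta_handle, tsv_handle):
    """Write one tab-delimited row per FASTA record; return the record count."""
    writer = csv.writer(tsv_handle, delimiter="\t", lineterminator="\n")
    # Column names are an assumption made for this sketch.
    writer.writerow(["id", "description", "sequence", "length"])
    count = 0
    for header, sequence in read_fasta(fasta_handle):
        # Treat the first word of the header as the id, the rest as a description.
        seq_id, _, description = header.partition(" ")
        writer.writerow([seq_id, description, sequence, len(sequence)])
        count += 1
    return count


if __name__ == "__main__":
    # Tiny made-up sample so the sketch runs without any input file.
    sample = ">seq1 example record\nACGTAC\nGTAA\n>seq2\nMKV\n"
    out = io.StringIO()
    fasta_to_tsv(io.StringIO(sample), out)
    print(out.getvalue())
```

    To convert a real file, open the .fa file for reading and a .txt or .tsv file for writing (with newline="" on the output), and pass both to fasta_to_tsv. Excel can open the resulting tab-delimited file directly.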

     

    Installing Orange was simple, and I allowed whatever add-ons it asked for. Eventually the setup completed and the software was installed.

    The following screenshot shows what the Orange installation looks like.

