Reference: post in stackoverflow.
Problem: Let us suppose the fortran program stores integer*4, real*8, complex*16 type of data as a binary files. How one can read the binary files in python?
(1) In Fortran : store data as a binary file
In this example, an real array 'fn' with dimension(100,100) is stored in a file.
Because the double precision real number is 8 byte, writing 'fn' corresponds to writing 8*100*100= 80000 bytes. When writing binary file ('direct' access), one have to specify a length of data and position to write the data. 'rec_array' is a length of data and 'rec' in write command tells where the 'rec_array' length have to be written in a file. For example, rec = 3 corresponds to the position after 2*80000 bytes.
program binary_out
implicit none
integer :: i,j,t,rec_array
double precision, dimension(100,100) :: fn
double precision, parameter :: p=2*3.1415929
INQUIRE(IOLENGTH=rec_array) fn
write(*,*) 'rec_array=',rec_array ! 100*100*8 byte
open(unit=10,file='temp.dat',status='unknown', &
form='unformatted',access='direct',recl=rec_array)
fn=0
write(10,rec=1) fn
do t=1,3
do i=1,100 ! filling all fn(i,j)
do j=1,100
fn(i,j)=sin(i*p*t/100)*cos(j*p*t/100)
enddo
enddo
write(10,rec=t+1) fn ! t-th rec_array size
enddo
! rec=1-4 fn
! Thus total size of file is 80000*4 bytes
close(10)
end program binary_out
Note: Writing to disk is actually done when the buffer is full or when the file is closed.
To write in real-time, one have to use "flush(10)"
(2) In python, one can open/read the file as a binary. One have to specify the type/size of the data to read (using 'np.fromfile') and position to start reading the data(using 'f.seek') . (Note that one cannot read the same position twice. One have to specify the starting position for each reading.)
#!/usr/bin/env python
import scipy
import glob
import numpy as np
import matplotlib.pyplot as plt
import os, sys
from pylab import *
# binary file is written as (8byte real)*(100*100 elements)*(4 cycles)
def readslice(inputfilename,field,nx,ny,timeslice):
f = open(inputfilename,'rb') # open as binary
f.seek(8*timeslice*nx*ny) # find starting byte position for reading
field = np.fromfile(f,dtype='float64',count=nx*ny) # 8byte = 64 bit
field = np.reshape(field,(nx,ny))
f.close()
return field
a=np.dtype('d')
a=readslice('temp.dat',a,100,100,1)
print(a[0])
im=plt.imshow(a)
plt.show()
In other words, the binary file have to be write/read by specifying the starting position of the data and length and type of the data. ( Since binary is a collection of '0' and '1', same binary can be read in different ways depending on the length and format.)
It is important to know the exact type/order of the data to read/write.
Note: In case of multi-dimensional array from fortran, reshaping could be different from original one because of the convention difference. To have the same indexing, one can use '"order='F' " in the np.reshape. For example, fortran wrote a array 'f(1:Nx,1:Ny,1:Nz)' to get the same array in python 'f(0:Nx-1,0:Ny-1,0:Nz-1)', use
a = np.fromfile(f,dtype='float64',count=Nx*Ny*Nz)
a = np.reshape(a,(Nx,Ny,Nz),order='F')
In case of c, order='C'
댓글 없음:
댓글 쓰기