Issues with iconv command in script

Problem Description:

I am trying to create a script that detects whether files in a directory contain non-UTF-8 characters and, if they do, determines the encoding of each such file and runs iconv on it.

The code is as follows:

find  <directory> |sed '1d'><directory>/filelist.txt

while read filename
do
file_nm=${filename%%.*}
ext=${filename#*.}
echo $filename
q=`grep -axv '.*' $filename|wc -l`
echo $q
r=`file -i $filename|cut -d '=' -f 2`
echo $r
#file_repair=$file_nm
if [ $q -gt 0 ]; then
iconv -f $r -t utf-8 -c ${file_nm}.${ext} >${file_nm}_repaired.${ext}

mv ${file_nm}_repaired.${ext} ${file_nm}.${ext}

fi
done< <directory>/filelist.txt

When I run the code, several files end up as 0-byte files and .bak gets appended to the file name.

ls| grep 'bak' | wc -l

36

Where am I making a mistake?

Thanks for the help.

Solution – 1

It’s really not clear what some parts of your script are supposed to do.

Probably the error is that you are assuming file -i will output a string which always contains =; but it often doesn’t.
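
To see why a missing = matters: cut passes lines that lack the delimiter through unchanged (unless -s is given), so the whole line would end up as the -f argument to iconv. A minimal sketch of this, where charset_of is a hypothetical helper name and the sample strings are made up to resemble file -i output:

```shell
# Lines without the delimiter pass through cut unchanged (no -s given),
# so this prints the whole input line rather than an empty string.
printf '%s\n' 'data.bin: cannot open' | cut -d '=' -f 2

# Hypothetical helper: print the charset from a `file -i` style line,
# or fail if the line carries no charset= field at all.
charset_of() {
    case $1 in
        *charset=*) printf '%s\n' "${1##*charset=}" ;;
        *) return 1 ;;
    esac
}

charset_of 'notes.txt: text/plain; charset=iso-8859-1'   # prints iso-8859-1
charset_of 'data.bin: cannot open' || echo 'no charset reported' >&2
```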

find  <directory> |
# avoid temporary file
sed '1d' |
# use IFS='' read -r
while IFS='' read -r filename
do
    # indent loop body
    file_nm=${filename%%.*}
    ext=${filename#*.}
    # quote variables, print diagnostics to stderr
    echo "$filename" >&2
    # use grep -q instead of useless wc -l; don't enter condition needlessly; quote variable
    if grep -qaxv '.*' "$filename"; then
        # indent condition body
        # use modern command substitution syntax, quote variable
        # check if result contains =
        r=$(file -i "$filename")
        case $r in
          *=*)
            # only perform decoding if we can establish encoding
            echo "$r" >&2
            iconv -f "${r#*=}" -t utf-8 -c "${file_nm}.${ext}" >"${file_nm}_repaired.${ext}"
            mv "${file_nm}_repaired.${ext}" "${file_nm}.${ext}" ;;
          *)
            echo "$r: could not establish encoding" >&2 ;;
        esac
    fi
done
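
One possible source of the 0-byte files, though I can't confirm it from the question, is a conversion that fails (for example because file reports a charset name iconv does not accept): the redirection still creates an empty output file, which then replaces the original. A sketch of a more defensive variant follows. convert_in_place is a hypothetical helper name, it assumes mktemp is available, and it deliberately omits -c so that unconvertible input is reported instead of silently dropped:

```shell
# Hypothetical helper: convert FILE from ENCODING to UTF-8 in place,
# replacing the original only when iconv reports success.
# Usage: convert_in_place FILE ENCODING
convert_in_place() {
    src=$1
    from=$2
    # Temporary file next to the original, so mv stays on one filesystem.
    tmp=$(mktemp "${src}.XXXXXX") || return 1
    if iconv -f "$from" -t UTF-8 "$src" >"$tmp"; then
        mv "$tmp" "$src"
    else
        # Conversion failed: discard the partial output, keep the original.
        rm -f "$tmp"
        echo "$src: conversion from $from failed, left unchanged" >&2
        return 1
    fi
}
```

Inside the loop this would be called as convert_in_place "$filename" "${r#*=}". Note that the replaced file takes on the permissions of the temporary file created by mktemp.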

See also "Why is testing “$?” to see if a command succeeded or not, an anti-pattern?" (tangential, but probably worth reading) and "useless use of wc".

The grep regex is kind of mysterious. I’m guessing you want to check if the file contains non-empty lines? grep -qa . "$filename" would do that.
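
Another possible reading: in a UTF-8 locale, GNU grep's . does not match bytes that are invalid UTF-8, so grep -axv '.*' would select lines containing invalid sequences, which may be what the question intended. That behavior depends on the locale the script runs in. If that is the goal, a check that does not depend on the locale is to ask iconv to convert from UTF-8 to UTF-8 and look at its exit status. This is a different technique from the one in the question, and is_valid_utf8 is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if FILE decodes cleanly as UTF-8.
# iconv exits non-zero when it meets an invalid input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example use inside the loop:
#   if ! is_valid_utf8 "$filename"; then ...convert...; fi
```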
