C#|.NET : Size String, Truncate Lines

Swiss Crop Circle 2009 Aerial by Kecko, on Flickr
Swiss Crop Circle 2009 Aerial, a photo by Kecko on Flickr.

I needed to truncate lines from CSV string prepared for export through mail in my app. Windows Phone mail restricts text size to 1MB. In my app’s case some CSV reports could go beyond 1MB. To make sure that the CSV report does not exceed a desired size in bytes, I created this extension method – SizeIt. SizeIt can truncate lines (and characters if required) from a given string from beginning or from end and also inserts information text line (how many lines and characters removed) in the resulting string. Following is the test form created to show what the method does to a string. In this example a 300 byte 10 line string is given as input and asked to reduce it to <200 bytes, first from beginning, and then from end.

SizeTail

SizeFront

Here is the code for the extension method:

    public static class StringBuilderExtensions
    {
        /// <summary>
        /// Sizes StringBuilder to given size
        /// </summary>
        /// <param name="sb">StringBuilder</param>
        /// <param name="bytes">Maximum Size of resultant string. Mostly the size of string will be less than max size</param>
        /// <param name="removeLinesFromBeginning">Pass true, if clipping has to happen in the beginning</param>
        public static void SizeIt(this StringBuilder sb, int bytes, bool removeLinesFromBeginning)
        {
            bool _stringReplacementHappened = false;
            bool _maxBytesAdjusted_for_lines = false;
            bool _maxBytesAdjusted_for_chars = false;
            string _insertThismessage = "";
            int _linesRemoved = 0;
            int _charsRemoved = 0;
            //try removing lines.
            while (System.Text.Encoding.Unicode.GetByteCount(sb.ToString()) > bytes)
            {
                char _alternateNewLine = '\r';
                string _workingString = sb.ToString();
                if (!_workingString.Contains(Environment.NewLine) && !_workingString.Contains(_alternateNewLine))
                {
                    break;
                }
                int _newlinelocation = sb.ToString().IndexOf(Environment.NewLine);
                int _lastNewlinelocation = sb.ToString().LastIndexOf(Environment.NewLine);
                if (_newlinelocation <= 0)
                {
                    _newlinelocation = sb.ToString().IndexOf(_alternateNewLine);
                }
                if (_lastNewlinelocation <= 0)
                {
                    _lastNewlinelocation = sb.ToString().LastIndexOf(_alternateNewLine);
                }

                if (removeLinesFromBeginning)
                {
                    sb = sb.Remove(0, _newlinelocation + 1);
                    _linesRemoved++;
                    _stringReplacementHappened = true;
                    _insertThismessage = string.Format("... {0} lines removed{1}", _linesRemoved, Environment.NewLine);
                }
                else
                {
                    int _lastReturnAt = _lastNewlinelocation;
                    int _charsInLastLine = sb.Length - _lastNewlinelocation;
                    sb = sb.Remove(_lastReturnAt, _charsInLastLine);
                    _linesRemoved++;
                    _stringReplacementHappened = true;
                    _insertThismessage = string.Format("{1}{0} lines removed...", _linesRemoved, Environment.NewLine);
                }
                if (!_maxBytesAdjusted_for_lines)
                {
                    bytes -= 50; //40 extra bytes for information text
                    _maxBytesAdjusted_for_lines = true;
                }
            }
            //if it's still more than the desired size
            if (System.Text.Encoding.Unicode.GetByteCount(sb.ToString()) > bytes)
            {
                int _currentBytes = System.Text.Encoding.Unicode.GetByteCount(sb.ToString());
                if (!_maxBytesAdjusted_for_lines)
                {
                    bytes -= 100;
                    _maxBytesAdjusted_for_lines = true;
                    _maxBytesAdjusted_for_chars = true;
                }
                else if (!_maxBytesAdjusted_for_chars)
                {
                    bytes -= 50;
                    _maxBytesAdjusted_for_chars = true;
                }
                int _removeChars = (_currentBytes - bytes) / 2;
                if (removeLinesFromBeginning)
                {
                    sb = sb.Remove(0, _removeChars);
                    _stringReplacementHappened = true;
                    _insertThismessage = string.Format("... {0} lines and {1} chars removed{2}", _linesRemoved, _removeChars, Environment.NewLine);
                }
                else
                {
                    sb = sb.Remove(sb.Length - _removeChars, _removeChars);
                    _stringReplacementHappened = true;
                    _insertThismessage = string.Format("{2}{0} lines and {1} chars removed...", _linesRemoved, _removeChars, Environment.NewLine);
                }
            }
            if (_stringReplacementHappened)
            {
                if (removeLinesFromBeginning)
                {
                    sb.Insert(0, _insertThismessage);
                }
                else
                {
                    sb.Append(_insertThismessage);
                }
            }
        }
    }

You may extend SizeIt to make it even smarter because it has following shortcomings in current version:

  • Does not truncate very closely to Max size. Will always be less than 50 bytes from max.
  • If there are only two lines and line of truncating side is huge, even a slight reduction in size will remove the complete content of the huge line, meaning loosing most of the text. This could be taken care of by firsts analyzing the string and then truncating line/character, as required
  • The logic inherently tries to remove lines first, should intelligently decide whether to remove line or chars.
Advertisements

One thought on “C#|.NET : Size String, Truncate Lines

  1. Pingback: Dew Drop – The Return – April 22, 2014 (#1760) | Morning Dew

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s