Removing extra whitespace from generated HTML in MVC


I have an MVC application view that generates quite a large HTML table of values (>20MB).

I am compressing the view in the controller using a compression filter:

 internal class CompressFilter : ActionFilterAttribute
 {
     public override void OnActionExecuting(ActionExecutingContext filterContext)
     {
         HttpRequestBase request = filterContext.HttpContext.Request;
         string acceptEncoding = request.Headers["Accept-Encoding"];
         if (string.IsNullOrEmpty(acceptEncoding))
             return;
         acceptEncoding = acceptEncoding.ToUpperInvariant();
         HttpResponseBase response = filterContext.HttpContext.Response;
         if (acceptEncoding.Contains("GZIP"))
         {
             response.AppendHeader("Content-encoding", "gzip");
             response.Filter = new GZipStream(response.Filter, CompressionMode.Compress);
         }
         else if (acceptEncoding.Contains("DEFLATE"))
         {
             response.AppendHeader("Content-encoding", "deflate");
             response.Filter = new DeflateStream(response.Filter, CompressionMode.Compress);
         }
     }
 }

Is there a way to also remove the (fairly large) amount of redundant whitespace generated in the view before the compression filter runs, to reduce both the compression workload and the compressed size?

EDIT: I got it working using the WhiteSpaceFilter technique suggested by womp below.

For interest, here are the results, as measured by Firebug:

1) No compression, no whitespace stripping - 21MB, 2.59 minutes
2) With GZIP compression, no whitespace stripping - 2MB, 17.59 s
3) With GZIP compression and whitespace stripping - 558kB, 12.77 s

So it is certainly worth doing.
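
A comparison like the one above can be approximated offline. The following is a minimal sketch, not the code used for the Firebug measurements: the table contents, the `>\s+<` pattern and the names `BuildTable`, `StripWhitespace` and `GzipSize` are all assumptions made for illustration, and the sizes it prints will differ from the figures quoted.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Text.RegularExpressions;

public static class CompressionComparison
{
    // Builds an indented HTML table with made-up values, shaped like a generated view.
    public static string BuildTable(int rows)
    {
        var sb = new StringBuilder("<table>\n");
        for (int i = 0; i < rows; i++)
        {
            sb.Append("    <tr>\n");
            sb.Append("        <td>").Append(i).Append("</td>\n");
            sb.Append("        <td>").Append(i * 31 % 1000).Append("</td>\n");
            sb.Append("    </tr>\n");
        }
        return sb.Append("</table>\n").ToString();
    }

    // Assumed pattern: drop whitespace that sits between two tags.
    public static string StripWhitespace(string html)
    {
        return Regex.Replace(html, @">\s+<", "><");
    }

    // Returns the number of bytes the text occupies after GZIP compression.
    public static int GzipSize(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the MemoryStream.
            return output.ToArray().Length;
        }
    }

    public static void Main()
    {
        string html = BuildTable(10000);
        string stripped = StripWhitespace(html);
        Console.WriteLine("raw:             {0} bytes", Encoding.UTF8.GetByteCount(html));
        Console.WriteLine("raw + gzip:      {0} bytes", GzipSize(html));
        Console.WriteLine("stripped:        {0} bytes", Encoding.UTF8.GetByteCount(stripped));
        Console.WriteLine("stripped + gzip: {0} bytes", GzipSize(stripped));
    }
}
```

Transfer time depends on the network and the browser, so this only compares payload sizes.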

Author: Drew Noakes, 2009-05-13

8 answers

This guy wrote a neat little whitespace compactor that simply runs a quick block copy of your bytes through a regular expression to strip out the blobs of whitespace. He wrote it as an HTTP module, but you could take the 7 lines of workhorse code out of it and drop them into your own function.
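
The core idea, running the generated markup through a regular expression, might look roughly like this. It is a sketch under assumptions, not the code from the module mentioned above: the two patterns and the `WhitespaceCompactor` name are hypothetical, and the sketch makes no attempt to protect `<pre>` blocks or whitespace that matters between inline elements.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceCompactor
{
    // Assumed pattern: any run of whitespace between a closing '>' and the next '<'.
    private static readonly Regex BetweenTags =
        new Regex(@">\s+<", RegexOptions.Compiled);

    // Assumed pattern: runs of two or more whitespace characters anywhere else.
    private static readonly Regex Runs =
        new Regex(@"\s{2,}", RegexOptions.Compiled);

    public static string Compact(string html)
    {
        if (html == null) throw new ArgumentNullException("html");
        string result = BetweenTags.Replace(html, "><");
        return Runs.Replace(result, " ");
    }

    public static void Main()
    {
        string html = "<table>\n  <tr>\n    <td>1</td>\n    <td>two   words</td>\n  </tr>\n</table>";
        Console.WriteLine(Compact(html));
        // prints "<table><tr><td>1</td><td>two words</td></tr></table>"
    }
}
```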

 20
Author: womp,
2014-08-14 16:00:20

@womp has already suggested a good way to do it, but that module is quite outdated. I had been using it, but it turns out not to be the optimal approach. Here is the question I asked about it:

Remove white space from entire Html but inside pre with regular expressions

Here is how I do it:

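using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web.Mvc;

public class RemoveWhitespacesAttribute : ActionFilterAttribute {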

    public override void OnActionExecuted(ActionExecutedContext filterContext) {

        var response = filterContext.HttpContext.Response;

        //Temp fix. I am not sure what causes this but ContentType is coming as text/html
        if (filterContext.HttpContext.Request.RawUrl != "/sitemap.xml") {

            if (response.ContentType == "text/html" && response.Filter != null) {
                response.Filter = new HelperClass(response.Filter);
            }
        }
    }

    private class HelperClass : Stream {

        private System.IO.Stream Base;

        public HelperClass(System.IO.Stream ResponseStream) {

            if (ResponseStream == null)
                throw new ArgumentNullException("ResponseStream");
            this.Base = ResponseStream;
        }

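        // Built once and reused for every write; the pattern leaves whitespace inside <pre> alone.
        //Thanks to Qtax
        //https://stackoverflow.com/questions/8762993/remove-white-space-from-entire-html-but-inside-pre-with-regular-expressions
        private static readonly Regex WhitespaceRegex = new Regex(@"(?<=\s)\s+(?![^<>]*</pre>)", RegexOptions.Compiled);

        public override void Write(byte[] buffer, int offset, int count) {

            // Each chunk is decoded and filtered on its own, so a multi-byte character
            // or a whitespace run split across two writes is not handled.
            string html = Encoding.UTF8.GetString(buffer, offset, count);
            html = WhitespaceRegex.Replace(html, string.Empty);

            byte[] output = Encoding.UTF8.GetBytes(html);
            this.Base.Write(output, 0, output.Length);
        }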

        #region Other Members

        public override int Read(byte[] buffer, int offset, int count) {

            throw new NotSupportedException();
        }

        public override bool CanRead{ get { return false; } }

        public override bool CanSeek{ get { return false; } }

        public override bool CanWrite{ get { return true; } }

        public override long Length{ get { throw new NotSupportedException(); } }

        public override long Position {

            get { throw new NotSupportedException(); }
            set { throw new NotSupportedException(); }
        }

        public override void Flush() {

            Base.Flush();
        }

        public override long Seek(long offset, SeekOrigin origin) {

            throw new NotSupportedException();
        }

        public override void SetLength(long value) {

            throw new NotSupportedException();
        }

        #endregion
    }

}
 6
Author: tugberk,
2017-05-23 10:30:55

You can remove the whitespace at compile time by extending Razor. That eliminates the (very significant, by my measurements) runtime hit of stripping whitespace from the generated HTML. The hit is as large as 88 ms on a high-end i7 when trimming a 100 KB document with regex-based code found on Stack Overflow.

The following provides an implementation of a compile-time solution for MVC 3 and MVC 4:

Meleze.Web

The solution is described at

http://cestdumeleze.net/blog/2011/minifying-the-html-with-asp-net-mvc-and-razor/

(but use the GitHub code or the NuGet DLL, since the code in the blog post only covers MVC 3).

 4
Author: Eric J.,
2014-08-14 15:58:13

I would say that if your view is generating more than 20 MB of data, you may want to look into different ways of displaying the data, perhaps paging?
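
Paging could be sketched with LINQ's `Skip` and `Take`, as below. The `GetPage` helper, its 1-based `pageNumber` parameter and the page size of 50 are assumptions for illustration; in a real application the query would normally be paged at the data source rather than in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Returns the rows for a 1-based page number (hypothetical helper).
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageNumber, int pageSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");
        return source.Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList();
    }

    public static void Main()
    {
        // Stand-in for the table rows; only one page of them would be rendered.
        IEnumerable<int> rows = Enumerable.Range(1, 1000);
        List<int> page = GetPage(rows, 3, 50);
        Console.WriteLine("{0}..{1} ({2} rows)", page.First(), page.Last(), page.Count);
        // prints "101..150 (50 rows)"
    }
}
```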

 2
Author: Tom Anderson,
2009-05-13 00:34:22
#region Stream filter
class StringFilterStream : Stream
{
  private Stream _sink;
  private Func<string, string> _filter;

  public StringFilterStream(Stream sink, Func<string, string> filter) {
    _sink = sink;
    _filter = filter;
  }

  #region Mixin Properties/Methods
  public override bool CanRead { get { return true; } }
  public override bool CanSeek { get { return true; } }
  public override bool CanWrite { get { return true; } }
  public override void Flush() { _sink.Flush(); }
  public override long Length { get { return 0; } }
  private long _position;
  public override long Position {
    get { return _position; }
    set { _position = value; }
  }
  public override int Read(byte[] buffer, int offset, int count) {
    return _sink.Read(buffer, offset, count);
  }
  public override long Seek(long offset, SeekOrigin origin) {
    return _sink.Seek(offset, origin);
  }
  public override void SetLength(long value) {
    _sink.SetLength(value);
  }
  public override void Close() {
    _sink.Close();
  }
  #endregion

  public override void Write(byte[] buffer, int offset, int count) {
    // intercept the data and convert to string; decode only the
    // bytes in [offset, offset + count), not the whole buffer
    string s = Encoding.Default.GetString(buffer, offset, count);

    // apply the filter
    s = _filter(s);

    // write the data back to stream
    byte[] outdata = Encoding.Default.GetBytes(s);
    _sink.Write(outdata, 0, outdata.Length);
  }
}
#endregion

public enum WebWhitespaceFilterContentType
{
  Xml = 0, Css = 1, Javascript = 2
}
public class WebWhitespaceFilterAttribute : ActionFilterAttribute
{
  private WebWhitespaceFilterContentType _contentType;

  public WebWhitespaceFilterAttribute() {
    _contentType = WebWhitespaceFilterContentType.Xml;
  }
  public WebWhitespaceFilterAttribute(WebWhitespaceFilterContentType contentType) {
    _contentType = contentType;
  }

  public override void OnActionExecuting(ActionExecutingContext filterContext) {

    var request = filterContext.HttpContext.Request;
    var response = filterContext.HttpContext.Response;

    switch (_contentType) {
      case WebWhitespaceFilterContentType.Xml:

        response.Filter = new StringFilterStream(response.Filter, s => {
          s = Regex.Replace(s, @"\s+", " ");
          s = Regex.Replace(s, @"\s*\n\s*", "\n");
          s = Regex.Replace(s, @"\s*\>\s*\<\s*", "><");
          // single-line doctype must be preserved
          var firstEndBracketPosition = s.IndexOf(">");
          if (firstEndBracketPosition >= 0) {
            s = s.Remove(firstEndBracketPosition, 1);
            s = s.Insert(firstEndBracketPosition, ">\n");
          }
          return s;
        });
        break;

      case WebWhitespaceFilterContentType.Css:
      case WebWhitespaceFilterContentType.Javascript:

        response.Filter = new StringFilterStream(response.Filter, s => {
          s = Regex.Replace(s, @"/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/", "");
          s = Regex.Replace(s, @"\s+", " ");
          s = Regex.Replace(s, @"\s*{\s*", "{");
          s = Regex.Replace(s, @"\s*}\s*", "}");
          s = Regex.Replace(s, @"\s*;\s*", ";");
          return s;
        });
        break;
    }
  }
}
 2
Author: ,
2009-06-17 04:04:44

Here is a VB.NET version of a whitespace filter attribute that I am using in a project:

#Region "Imports"

    Imports System.IO
    Imports System.Text
    Imports System.Text.RegularExpressions
    Imports System.Web.Mvc

#End Region

Namespace MyCompany.Web.Mvc.Extensions.ActionFilters

    ''' <summary>
    ''' WhitespaceFilter attribute
    ''' </summary>
    Public NotInheritable Class WhitespaceFilterAttribute
        Inherits ActionFilterAttribute

        ''' <summary>
        ''' Called when action executing.   
        ''' </summary>
        ''' <param name="filterContext">The filter context.</param>
        ''' <remarks></remarks>
        Public Overrides Sub OnActionExecuting(filterContext As ActionExecutingContext)

                filterContext.HttpContext.Response.Filter = New WhitespaceFilterStream(filterContext.HttpContext.Response.Filter)

        End Sub

    #Region "Whitespace stream filter"

            ''' <summary>
            ''' Whitespace stream filter
            ''' </summary>
            Private Class WhitespaceFilterStream
                Inherits Stream

    #Region "Declarations"

                ' Member vars.
                Private Shared regexPattern As New Regex("(?<=[^])\t{2,}|(?<=[>])\s{2,}(?=[<])|(?<=[>])\s{2,11}(?=[<])|(?=[\n])\s{2,}")
                ' Property vars.
                Private sinkStreamValue As Stream
                Private positionValue As Long

    #End Region

    #Region "Constructor(s)"

                ''' <summary>
                ''' Constructor to create a new object.
                ''' </summary>
                ''' <param name="sink"></param>
                ''' <remarks></remarks>
                Public Sub New(sink As Stream)

                    Me.sinkStreamValue = sink

                End Sub

    #End Region

    #Region "Properties"

                ''' <summary>
                ''' Gets the CanRead value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property CanRead() As Boolean
                    Get
                        Return True
                    End Get
                End Property

                ''' <summary>
                ''' Gets the CanSeek value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property CanSeek() As Boolean
                    Get
                        Return True
                    End Get
                End Property

                ''' <summary>
                ''' Gets the CanWrite value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property CanWrite() As Boolean
                    Get
                        Return True
                    End Get
                End Property

                ''' <summary>
                ''' Get Length value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property Length() As Long
                    Get
                        Return 0
                    End Get
                End Property

                ''' <summary>
                ''' Get or sets Position value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides Property Position() As Long
                    Get
                        Return Me.positionValue
                    End Get
                    Set(value As Long)
                        Me.positionValue = value
                    End Set
                End Property

    #End Region

    #Region "Stream Overrides Methods"

                ''' <summary>
                ''' Stream object Close method.
                ''' </summary>
                ''' <remarks></remarks>
                Public Overrides Sub Close()

                    Me.sinkStreamValue.Close()

                End Sub

                ''' <summary>
                ''' Stream object Flush method.
                ''' </summary>
                ''' <remarks></remarks>
                Public Overrides Sub Flush()

                    Me.sinkStreamValue.Flush()

                End Sub

                ''' <summary>
                ''' Stream object Read method.
                ''' </summary>
                ''' <param name="buffer"></param>
                ''' <param name="offset"></param>
                ''' <param name="count"></param>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides Function Read(buffer As Byte(), offset As Integer, count As Integer) As Integer

                    Return Me.sinkStreamValue.Read(buffer, offset, count)

                End Function

                ''' <summary>
                ''' Stream object Seek method.
                ''' </summary>
                ''' <param name="offset"></param>
                ''' <param name="origin"></param>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides Function Seek(offset As Long, origin As SeekOrigin) As Long

                    Return Me.sinkStreamValue.Seek(offset, origin)

                End Function

                ''' <summary>
                ''' Stream object SetLength method.
                ''' </summary>
                ''' <param name="value"></param>
                ''' <remarks></remarks>
                Public Overrides Sub SetLength(value As Long)

                    Me.sinkStreamValue.SetLength(value)

                End Sub

                ''' <summary>
                ''' Stream object Write method.
                ''' </summary>
                ''' <param name="bufferBytes"></param>
                ''' <param name="offset"></param>
                ''' <param name="count"></param>
                ''' <remarks></remarks>
                Public Overrides Sub Write(bufferBytes As Byte(), offset As Integer, count As Integer)

                    ' Decode only the bytes written in this call (offset/count), not the whole buffer.
                    Dim html As String = Encoding.Default.GetString(bufferBytes, offset, count)

                    html = regexPattern.Replace(html, String.Empty)

                    Dim output As Byte() = Encoding.Default.GetBytes(html)
                    Me.sinkStreamValue.Write(output, 0, output.Length)

                End Sub

    #End Region

            End Class

    #End Region

        End Class

    End Namespace

And in Global.asax.vb:

Shared Sub RegisterGlobalFilters(ByVal filters As GlobalFilterCollection)

    With filters
        ' Standard MVC filters
        .Add(New HandleErrorAttribute())
        ' MyCompany MVC filters
        .Add(New CompressionFilterAttribute)
        .Add(New WhitespaceFilterAttribute)
    End With

End Sub
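
The registration order matters because each filter wraps whatever stream is already in place, so the filter assigned last sees the output first. The following C# sketch shows that wrapping with plain streams, outside ASP.NET; `StripStream`, `FilterOrderDemo` and the `>\s+<` pattern are hypothetical names and choices made for illustration, and the sketch assumes `Response.Filter` chaining behaves like ordinary stream wrapping.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Text.RegularExpressions;

// Write-only stream that strips whitespace between tags, then forwards to the inner stream.
public sealed class StripStream : Stream
{
    private readonly Stream _inner;
    public StripStream(Stream inner) { _inner = inner; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        string text = Encoding.UTF8.GetString(buffer, offset, count);
        byte[] output = Encoding.UTF8.GetBytes(Regex.Replace(text, @">\s+<", "><"));
        _inner.Write(output, 0, output.Length);
    }

    public override void Flush() { _inner.Flush(); }

    protected override void Dispose(bool disposing)
    {
        // Disposing the inner GZipStream flushes the compressed data to its sink.
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

public static class FilterOrderDemo
{
    public static byte[] Run(string html)
    {
        var sink = new MemoryStream();
        // Compression is attached first, so it wraps the sink...
        Stream filter = new GZipStream(sink, CompressionMode.Compress);
        // ...and the whitespace filter wraps the compressor, so it sees the plain text.
        filter = new StripStream(filter);

        byte[] raw = Encoding.UTF8.GetBytes(html);
        filter.Write(raw, 0, raw.Length);
        filter.Dispose();
        return sink.ToArray();
    }

    public static string Decompress(byte[] data)
    {
        using (var gzip = new GZipStream(new MemoryStream(data), CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        byte[] compressed = Run("<tr>\n    <td>1</td>\n</tr>");
        Console.WriteLine(Decompress(compressed)); // prints "<tr><td>1</td></tr>"
    }
}
```

If the two wrappers were attached in the opposite order, the whitespace filter would receive already-compressed bytes and corrupt them.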
 1
Author: Ed DeGagne,
2012-10-11 19:09:56

Whitespace compresses quite well; I don't think removing it is going to save you much.

I would suggest trying to offload some of the HTML generation to the client if possible, using JavaScript to reconstitute the parts that repeat.

 0
Author: great_llama,
2009-05-13 00:47:44

If you return JSON from the view, it is already minified and should not contain any whitespace or CR/LF. You should use paging to avoid sending so much data to the browser at once.

 -3
Author: Dave Swersky,
2009-05-13 00:37:54