OpenCL performance on Intel i5-11400, measured with clpeak

Jan 15, 2025 - 07:30
Platform: Intel(R) CPU Runtime for OpenCL(TM) Applications
  Device: 11th Gen Intel(R) Core(TM) i5-11400 @ 2.60GHz
    Driver version  : 18.1.0.0920 (Linux x64)
    Compute units   : 12
    Clock frequency : 2600 MHz

    Global memory bandwidth (GBPS)
      float   : 15.99
      float2  : 17.11
      float4  : 13.20
      float8  : 12.18
      float16 : 11.62

    Single-precision compute (GFLOPS)
      float   : 400.87
      float2  : 788.43
      float4  : 776.47
      float8  : 774.72
      float16 : 771.39

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 396.97
      double2  : 389.92
      double4  : 389.38
      double8  : 388.42
      double16 : 283.72

    Integer compute (GIOPS)
      int   : 146.78
      int2  : 290.23
      int4  : 399.68
      int8  : 180.19
      int16 : 185.39

    Integer compute Fast 24bit (GIOPS)
      int   : 113.77
      int2  : 182.26
      int4  : 193.19
      int8  : 191.54
      int16 : 198.80

    Integer char (8bit) compute (GIOPS)
      char   : 123.82
      char2  : 235.34
      char4  : 374.39
      char8  : 367.15
      char16 : 390.30

    Integer short (16bit) compute (GIOPS)
      short   : 250.93
      short2  : 482.91
      short4  : 840.59
      short8  : 281.22
      short16 : 336.37

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 7.22
      enqueueReadBuffer               : 7.64
      enqueueWriteBuffer non-blocking : 7.23
      enqueueReadBuffer non-blocking  : 7.62
      enqueueMapBuffer(for read)      : 8349.47
        memcpy from mapped ptr        : 7.64
      enqueueUnmap(after write)       : 7814.71
        memcpy to mapped ptr          : 7.24

    Kernel launch latency : 1.57 us
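
To compare runs or pick out the best-performing vector width per test, the report can be turned into a dictionary. What follows is only a minimal sketch: it assumes the text layout shown above (a section title ending in a unit in parentheses, followed by `label : value` lines). The names `parse_clpeak` and `best_width` are hypothetical helpers, not part of clpeak, and `SAMPLE` is an excerpt copied from the report.

```python
import re

# Excerpt copied from the report above; a full run could be read from a file.
SAMPLE = """\
    Single-precision compute (GFLOPS)
      float   : 400.87
      float2  : 788.43
      float4  : 776.47
      float8  : 774.72
      float16 : 771.39

    Integer short (16bit) compute (GIOPS)
      short   : 250.93
      short2  : 482.91
      short4  : 840.59
      short8  : 281.22
      short16 : 336.37
"""

# A section header ends with its unit in parentheses, e.g. "... (GFLOPS)".
SECTION = re.compile(r"^\s*(.+?)\s*\((GBPS|GFLOPS|GIOPS)\)\s*$")
# A result line looks like "label : 123.45" with nothing after the number.
ENTRY = re.compile(r"^\s*(.+?)\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*$")


def parse_clpeak(text):
    """Return {section title: {label: value}} for the unit-tagged sections."""
    results = {}
    current = None
    for line in text.splitlines():
        header = SECTION.match(line)
        if header:
            current = results.setdefault(header.group(1), {})
            continue
        entry = ENTRY.match(line)
        # Numeric lines before the first section (e.g. compute units) are skipped.
        if entry and current is not None:
            current[entry.group(1)] = float(entry.group(2))
    return results


def best_width(section):
    """Return the label with the highest measured value in one section."""
    return max(section, key=section.get)


if __name__ == "__main__":
    for title, values in parse_clpeak(SAMPLE).items():
        top = best_width(values)
        print(f"{title}: best = {top} ({values[top]})")
```

Lines that do not end in a bare number, such as the driver version or the kernel launch latency with its `us` suffix, are ignored by this sketch and would need their own handling.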