Netty

Use same JDK SSL test workaround when using ACCP as when just using the JDK SSL implementation (#9490)

Motivation:

14607979f6db074247d764cc4583461bcd298719 added tests for using ACCP but did not use the same exception-unwrapping technique as JdkSslEngineTest. This can lead to test failures on specific JDK versions.

Modifications:

Add the same unwrapping code

Result:

No more test failures

Simplify EventLoop abstractions for timed scheduled tasks (#9470)

Motivation

The epoll transport was updated in #7834 to decouple setting of the
timerFd from the event loop, so that scheduling delayed tasks does not
require waking up epoll_wait. To achieve this, new overridable hooks
were added in the AbstractScheduledEventExecutor and
SingleThreadEventExecutor superclasses.

However, the minimumDelayScheduledTaskRemoved hook has no current
purpose and I can't envisage a _practical_ need for it. Removing
it would reduce complexity and avoid supporting this specific
API indefinitely. We can add something similar later if needed
but the opposite is not true.

There also isn't a _nice_ way to use the abstractions for
wakeup-avoidance optimizations in other EventLoops that don't have a
decoupled timer.

This PR replaces executeScheduledRunnable and wakesUpForScheduledRunnable
with two new methods before/afterFutureTaskScheduled that have slightly
different semantics:

- They only apply to additions; given the current internals there's no
  practical use for removals
- They allow per-submission wakeup decisions via a boolean return val,
  which makes them easier to exploit from other existing EL impls (e.g.
  NIO/KQueue)
- They are subjectively "cleaner", taking just the deadline parameter
  and not exposing Runnables
- For current EL/queue impls, only the "after" hook is really needed,
  but specialized blocking queue impls can conditionally wake on task
  submission (I have one lined up)

Also included are further optimization/simplification/fixes to the
timerFd manipulation logic.

Modifications

- Remove AbstractScheduledEventExecutor#minimumDelayScheduledTaskRemoved()
  and supporting methods
- Uplift NonWakeupRunnable and corresponding default wakesUpForTask()
  impl from SingleThreadEventLoop to SingleThreadEventExecutor
- Change executeScheduledRunnable() to be package-private, and have a
  final impl in SingleThreadEventExecutor which triggers new overridable
  hooks before/afterFutureTaskScheduled()
- Remove unnecessary use of bookend tasks while draining the task queue
- Use new hooks to add simpler wake-up avoidance optimization to
  NioEventLoop (primarily to demonstrate utility/simplicity)
- Reinstate removed EpollTest class

In EpollEventLoop:

- Refactor to use only the new afterFutureTaskScheduled() hook for
  updating timerFd
- Fix setTimerFd race condition using a monitor
- Set nextDeadlineNanos to a negative value while the EL is awake and
  use this to block timer changes from outside the EL. Restore the
  known-set value prior to sleeping, updating timerFd first if necessary
- Don't read from timerFd when processing expiry event

Result

- Cleaner API for integrating with different EL/queue timing impls
- Fixed race condition to avoid missing scheduled wakeups
- Eliminate unnecessary timerFd updates while EL is awake, and
  unnecessary expired timerFd reads
- Avoid unnecessary scheduled-task wakeups when using NIO transport

I did not yet further explore the suggestion of using
TFD_TIMER_ABSTIME for the timerFd.
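
The idea of a per-submission wakeup decision driven only by the deadline can be sketched in plain JDK code. This is an illustrative toy, not Netty's SingleThreadEventExecutor: the class name, the hook name afterTaskScheduled, and the wakeup counter are assumptions made for the example.

```java
import java.util.PriorityQueue;

/** Toy scheduler illustrating a boolean "after scheduled" hook. Not Netty code. */
public class WakeupHookSketch {
    private final PriorityQueue<Long> deadlines = new PriorityQueue<>();
    // Deadline the loop is assumed to be sleeping until; MAX_VALUE means no timer is armed.
    private long sleepingUntilNanos = Long.MAX_VALUE;
    // Counts how often the (simulated) loop had to be woken up.
    public int wakeups;

    /** Hypothetical hook: receives only the deadline and decides whether a wakeup is needed. */
    protected boolean afterTaskScheduled(long deadlineNanos) {
        return deadlineNanos < sleepingUntilNanos;
    }

    public void schedule(long deadlineNanos) {
        deadlines.add(deadlineNanos);
        if (afterTaskScheduled(deadlineNanos)) {
            sleepingUntilNanos = deadlineNanos;
            wakeups++; // stand-in for actually waking the loop
        }
    }
}
```

Scheduling deadlines 1000, 2000 and 500 in that order wakes the loop twice in this sketch, because the task due at 2000 expires later than the deadline already being waited on.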

Update javadoc for NioEventLoop.setRatio() #9481 (#9484)

Motivation:

Improve the Javadoc for NioEventLoop.setRatio

Modification:

Update the Javadoc

Result:

Fixes #9481

Add tests for using Amazon Corretto Crypto Provider with Netty (#9480)

Motivation:

Amazon recently released the Amazon Corretto Crypto Provider, so we should include it in our test suite.

Modifications:

Add tests related to Amazon Corretto Crypto Provider

Result:

Test netty with Amazon Corretto Crypto Provider

Epoll: Avoid redundant EPOLL_CTL_MOD calls (#9397)

Motivation

Currently an epoll_ctl syscall is made every time there is a change to
the event interest flags (EPOLLIN, EPOLLOUT, etc) of a channel. These
are only done in the event loop so can be aggregated into 0 or 1 such
calls per channel prior to the next call to epoll_wait.

Modifications

I think further streamlining/simplification is possible but for now I've
tried to minimize structural changes and added the aggregation beneath
the existing flag manipulation logic.

A new AbstractChannel#activeFlags field records the flags last set on
the epoll fd for that channel. Calls to setFlag/clearFlag update the
flags field as before but instead of calling epoll_ctl immediately, just
set or clear a bit for the channel in a new bitset in the associated
EpollEventLoop to reflect whether there's any change to the last set
value.

Prior to calling epoll_wait the event loop makes the appropriate
epoll_ctl(EPOLL_CTL_MOD) call once for each channel whose bit is set.

Result

Fewer syscalls, particularly in some auto-read=false cases. Simplified
error handling from centralization of these calls.
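
The aggregation described above can be sketched without any native code. The following is a minimal model under stated assumptions: channels are plain integer indices, the syscall is replaced by a counter, and all names (InterestFlagBatcher, flush) are invented for the example rather than taken from Netty.

```java
import java.util.BitSet;

/** Toy model of batching interest-flag changes before a blocking wait. Not Netty code. */
public class InterestFlagBatcher {
    private final int[] wanted;               // flags currently requested per channel
    private final int[] active;               // flags last pushed to the simulated kernel
    private final BitSet dirty = new BitSet(); // channels whose wanted flags differ from active
    public int syscalls;                      // stand-in for epoll_ctl(EPOLL_CTL_MOD) calls

    public InterestFlagBatcher(int channels) {
        wanted = new int[channels];
        active = new int[channels];
    }

    public void setFlag(int channel, int flag) {
        wanted[channel] |= flag;
        mark(channel);
    }

    public void clearFlag(int channel, int flag) {
        wanted[channel] &= ~flag;
        mark(channel);
    }

    // The bit reflects whether there is any net change relative to the last pushed value.
    private void mark(int channel) {
        dirty.set(channel, wanted[channel] != active[channel]);
    }

    /** Called once before blocking: at most one simulated syscall per changed channel. */
    public void flush() {
        for (int ch = dirty.nextSetBit(0); ch >= 0; ch = dirty.nextSetBit(ch + 1)) {
            active[ch] = wanted[ch];
            syscalls++;
        }
        dirty.clear();
    }
}
```

A channel whose flag is set and then cleared again before the flush costs no simulated syscall at all, which is the "0 or 1 calls per channel" property mentioned in the motivation.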

Ensure we replace WebSocketServerProtocolHandshakeHandler before doing the handshake (#9472)

Motivation:

We need to ensure we replace WebSocketServerProtocolHandshakeHandler before doing the actual handshake as the handshake itself may complete directly and so forward pending bytes through the pipeline.

Modifications:

Replace the handler before doing the actual handshake.

Result:

Fixes https://github.com/netty/netty/issues/9471.

Detect truncated responses caused by EDNS0 and MTU mismatch (#9468)

Motivation:

It is possible that the user configures an EDNS0 setting that is too big for the MTU, in which case we may receive a truncated datagram packet. We should try to detect this and retry via TCP if possible.

Modifications:

- Fix detection of incomplete records

- Mark response as truncated if we did not consume the whole packet

- Add unit test

Result:

Fixes https://github.com/netty/netty/issues/9365

AsciiString contentEqualsIgnoreCase fails when arrayOffset is non-zero (#9477)

Motivation:

AsciiString.contentEqualsIgnoreCase may return true for non-matching strings of equal length when the offset is non-zero.

Modifications:

- Correctly take offset into account

- Add unit test

Result:

Fixes #9475
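
To illustrate what "taking the offset into account" means, here is a small stand-alone comparison over byte arrays. It is a sketch only: the class and method names are hypothetical, and it does not reproduce AsciiString's internals.

```java
/** Case-insensitive ASCII region comparison that honours both offsets. Illustrative only. */
public class OffsetIgnoreCaseCompare {
    private static byte toLower(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + 32) : b;
    }

    public static boolean regionEqualsIgnoreCase(byte[] a, int aOffset,
                                                 byte[] b, int bOffset, int length) {
        if (aOffset < 0 || bOffset < 0 || length < 0
                || a.length - aOffset < length || b.length - bOffset < length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            // Index each array relative to its own offset, never from position 0.
            if (toLower(a[aOffset + i]) != toLower(b[bOffset + i])) {
                return false;
            }
        }
        return true;
    }
}
```

With a backing array holding "xxHello" and an offset of 2, the region matches "hELLO"; comparing from offset 0 instead would examine "xxHel" and must not match.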

Avoid creating FileInputStream and FileOutputStream for obtaining Fil… (#8110)

Motivation:

If all we need is the FileChannel, we should use RandomAccessFile instead, because FileInputStream and FileOutputStream use a finalizer.

Modifications:

Replace FileInputStream and FileOutputStream with RandomAccessFile when possible.

Result:

Fixes https://github.com/netty/netty/issues/8078.
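
A minimal sketch of obtaining a FileChannel through RandomAccessFile rather than through a stream class follows. The temp-file round trip and the class name are assumptions made purely to keep the example self-contained.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Writes and reads a temp file using channels obtained from RandomAccessFile. */
public class ChannelViaRandomAccessFile {
    public static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("raf-sketch", ".txt");
            try {
                // "rw" mode gives a writable channel without creating a FileOutputStream.
                try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw");
                     FileChannel ch = raf.getChannel()) {
                    ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                    while (out.hasRemaining()) {
                        ch.write(out);
                    }
                }
                // "r" mode gives a read-only channel without creating a FileInputStream.
                try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r");
                     FileChannel ch = raf.getChannel()) {
                    ByteBuffer in = ByteBuffer.allocate((int) ch.size());
                    while (in.hasRemaining() && ch.read(in) >= 0) {
                        // keep reading until the buffer is full or EOF is reached
                    }
                    return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```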

Replace synchronized usage with ConcurrentHashMap in *Bootstrap classes (#9458)

Motivation:

In AbstractBootstrap, options and attrs are LinkedHashMaps that are synchronized on for every read, copy/clone, and write operation.

When a lot of connections are triggered concurrently on the same bootstrap instance, the synchronized blocks lead to contention, Netty IO threads get blocked, and performance may be severely degraded.

Modifications:

Use ConcurrentHashMap

Result:

Less contention. Fixes https://github.com/netty/netty/issues/9426
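
The general shape of the change can be sketched with a toy options holder. This is not the AbstractBootstrap code: the class, its string keys, and the snapshot method are assumptions for illustration. Two JDK facts matter for such a swap: ConcurrentHashMap rejects null values, so "null means remove" has to be handled explicitly, and it does not preserve insertion order the way LinkedHashMap does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy options holder that avoids a shared lock on every access. Not Netty code. */
public class OptionsHolder {
    private final Map<String, Object> options = new ConcurrentHashMap<>();

    public OptionsHolder option(String key, Object value) {
        if (value == null) {
            // ConcurrentHashMap does not permit null values, so null is treated as removal.
            options.remove(key);
        } else {
            options.put(key, value);
        }
        return this;
    }

    /** Copy for a new connection; no synchronized block is needed to iterate. */
    public Map<String, Object> snapshot() {
        return new HashMap<>(options);
    }
}
```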

Cleanup docker / docker-compose configs (#9473)

Motivation:

We should use the same java versions whenever we use CentOS 6 or 7 and also use the latest Java12 version

Modifications:

- Use same Java versions

- Use latest Java 12 version

- Remove old configs which are not used anymore

Result:

Docker cleanup

    /docker/docker-compose.centos-6.110.yaml (-1, +1)
    /docker/docker-compose.centos-6.112.yaml (-1, +1)
    /docker/docker-compose.centos-6.graalvm1.yaml (-1, +1)
    /docker/docker-compose.centos-7.110.yaml (-1, +1)
    /docker/docker-compose.centos-7.112.yaml (-1, +1)
    /docker/docker-compose.centos-7.19.yaml (-1, +1)
    /docker/docker-sync-compose.centos-6.18.yaml (-34, +0)
HTTP2: Update local flow-controller on Channel.read() if needed (#9400)

Motivation:

We should update the flow-controller on Channel.read() to reduce overhead, in particular memory overhead.

See https://github.com/netty/netty/pull/9390#issuecomment-513008269

Modifications:

Move updateLocalWindowIfNeeded() to doBeginRead()

Result:

Reduce memory overhead

Don't zero non-readable buffer regions when capacity is decreased (#9427)

Motivation

#1802 fixed ByteBuf implementations to ensure that the whole buffer
region is preserved when capacity is increased, not just the readable
part. The behaviour is still different however when the capacity is
_decreased_ - data outside the currently-readable region is zeroed.

Modifications

Update ByteBuf capacity(int) implementations to also copy the whole
buffer region when the new capacity is less than the current capacity.

Result

Consistent behaviour of ByteBuf#capacity(int) regardless of whether the
new capacity is greater than or less than the current capacity.
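
The intended behaviour can be shown with a toy array-backed buffer. This is a simplified sketch with invented names, not a ByteBuf implementation; the index-trimming rule used here is an assumption chosen to keep the indices valid.

```java
import java.util.Arrays;

/** Toy buffer whose capacity change keeps every byte that still fits. Not Netty code. */
public class ShrinkPreservingBuffer {
    private byte[] array;
    private int readerIndex;
    private int writerIndex;

    public ShrinkPreservingBuffer(byte[] initial, int readerIndex, int writerIndex) {
        this.array = initial.clone();
        this.readerIndex = readerIndex;
        this.writerIndex = writerIndex;
    }

    public void capacity(int newCapacity) {
        // Copies the whole region [0, min(old, new)), not just the readable bytes.
        // When growing, Arrays.copyOf pads the new tail with zeros.
        array = Arrays.copyOf(array, newCapacity);
        writerIndex = Math.min(writerIndex, newCapacity);
        readerIndex = Math.min(readerIndex, writerIndex);
    }

    public byte getByte(int index) { return array[index]; }
    public int readerIndex() { return readerIndex; }
    public int writerIndex() { return writerIndex; }
    public int capacity() { return array.length; }
}
```

Shrinking an 8-byte buffer with readerIndex 4 to capacity 5 keeps bytes 0 to 3 intact in this sketch, even though they lie before the readable region.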

Fix possible NPE when using HttpClientCodec (#9465)

Motivation:

It was possible to trigger an NPE when, for example, we received more responses than requests, because we did not check whether the queue contained a method before trying to compare method names.

Modifications:

- Add extra null check

- Add unit test

Result:

Fixes https://github.com/netty/netty/issues/9459
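
The failure mode is easy to reproduce with a plain queue: polling an empty queue returns null, and comparing on that value throws. The sketch below uses hypothetical names and a String method instead of Netty's HttpMethod; it only illustrates the extra null check.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Tracks request methods so responses can be matched to them. Illustrative only. */
public class ResponseMethodTracker {
    private final Queue<String> sentMethods = new ArrayDeque<>();

    public void requestSent(String method) {
        sentMethods.offer(method);
    }

    /** Returns true if the next response belongs to a HEAD request. */
    public boolean responseIsForHead() {
        // poll() returns null when more responses arrive than requests were sent.
        String method = sentMethods.poll();
        return method != null && method.equals("HEAD");
    }
}
```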

Http2EmptyDataFrameConnectionDecoder.frameListener() should return unwrapped Http2FrameListener (#9467)

Motivation:

As we decorate the Http2FrameListener under the covers we should ensure the user can still access the original Http2FrameListener.

Modifications:

- Unwrap the Http2FrameListener in frameListener()

- Add unit test

Result:

Fewer surprises for users.

DnsNameResolverTest.testTruncated0(...) should only close socket once envelope is received (#9469)

Motivation:

We should only ever close the underlying TCP socket once we have received the envelope, to ensure we never race in the test.

Modifications:

- Only close the socket once we have received the envelope

- Set REUSE_ADDR

Result:

More robust test

Correctly respect mask parameters in all WebSocketClientHandshakerFactory#newHandshaker(...) methods (#9464)

Motivation:

We did not correctly pass the mask parameters in all cases.

Modifications:

Correctly pass on parameters

Result:

Fixes https://github.com/netty/netty/issues/9463.

Fix ByteBuf leak in Http2ControlFrameLimitEncoderTest (#9466)

Motivation:

We recently introduced Http2ControlFrameLimitEncoderTest which did not correctly notify the goAway promises and so leaked buffers.

Modifications:

Correctly notify all promises and so release the debug data.

Result:

Fixes leak in HTTP2 test

EPOLL - decouple schedule tasks from epoll_wait life cycle (#7834)

Motivation:

EPOLL supports decoupling the timed wakeup mechanism from the selector call. The EPOLL transport takes advantage of this in order to offer finer-grained timer resolution. However, we are currently calling timerfd_settime on each call to epoll_wait, and this is expensive. We don't have to re-arm the timer on every call to epoll_wait; instead we only have to arm the timer when a task is scheduled with an earlier expiration than any other existing scheduled task.

Modifications:

- Before scheduled tasks are added to the task queue, we determine if the new
  duration is the soonest to expire, and if so update with timerfd_settime. We
  also drain all the tasks at the end of the event loop to make sure we service
  any expired tasks and get an accurate next time delay.

- EpollEventLoop maintains a volatile variable which represents the next deadline to expire. This variable is modified inside the event loop thread (before calling epoll_wait) and outside the event loop thread (immediately to ensure proper wakeup time).

- Execute the task queue before the schedule task priority queue. This means we
  may delay the processing of scheduled tasks but it ensures we transfer all
  pending tasks from the task queue to the scheduled priority queue to run the
  soonest to expire scheduled task first.

- Deprecate IORatio on EpollEventLoop, and drain the executor and scheduled queue on each event loop wakeup. Coupling the amount of time we are allowed to drain the executor queue to a proportion of time we process inbound IO may lead to unbounded queue sizes and unpredictable latency.

Result:

Fixes https://github.com/netty/netty/issues/7829

- In most cases this results in fewer calls to timerfd_settime

- Fewer event loop wakeups just to check for scheduled tasks executed outside the event loop

- More predictable executor queue and scheduled task queue draining

- More accurate and responsive scheduled task execution
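
The "only arm the timer when the new deadline is the earliest" rule can be sketched with an atomic variable shared between threads. This is a simplified model with invented names, and the counter stands in for timerfd_settime. Note that in this sketch the compare-and-set and the arming step are two separate actions; a real implementation has to make sure a concurrent update cannot leave the timer armed with a stale value.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Re-arms a simulated timer only for deadlines earlier than the current one. Not Netty code. */
public class EarliestDeadlineTimer {
    // Long.MAX_VALUE means "no deadline armed".
    private final AtomicLong nextDeadlineNanos = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong armCount = new AtomicLong();

    /** May be called from any thread; returns true if the timer was re-armed. */
    public boolean onTaskScheduled(long deadlineNanos) {
        while (true) {
            long current = nextDeadlineNanos.get();
            if (deadlineNanos >= current) {
                return false; // an earlier or equal deadline is already armed
            }
            if (nextDeadlineNanos.compareAndSet(current, deadlineNanos)) {
                armCount.incrementAndGet(); // stand-in for timerfd_settime
                return true;
            }
            // Lost a race with another thread: re-read and try again.
        }
    }

    public long armCount() { return armCount.get(); }
    public long nextDeadlineNanos() { return nextDeadlineNanos.get(); }
}
```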

Use OpenJDK13 RC (#9457)

Motivation:

The first release candidate for OpenJDK13 was released.

Modifications:

Update version.

Result:

Use latest OpenJDK13 release

    /docker/docker-compose.centos-6.113.yaml (-1, +1)
    /docker/docker-compose.centos-7.113.yaml (-1, +1)
[maven-release-plugin] prepare for next development iteration

  1. … 24 more files in changeset.
[maven-release-plugin] prepare release netty-4.1.39.Final

  1. … 24 more files in changeset.
HTTP2: Guard against empty DATA frames (without end_of_stream flag) set (#9461)

Motivation:

It is possible for a remote peer to flood the server / client with empty DATA frames (without the end_of_stream flag set) and so cause high CPU usage without ever hitting a limit. We need to guard against this.

See CVE-2019-9518

Modifications:

- Add a new config option to AbstractHttp2ConnectionBuilder and sub-classes that allows setting the max number of consecutive empty DATA frames (without the end_of_stream flag). After this limit is hit we will close the connection. A limit of 10 is used by default.

- Add unit tests

Result:

Guards against CVE-2019-9518
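
The guard amounts to a counter of consecutive empty DATA frames that resets on any other DATA frame. The sketch below is a stand-alone illustration with hypothetical names. Whether the frame that reaches the limit or the one that exceeds it triggers the close is an assumption here, not taken from the commit.

```java
/** Counts consecutive empty DATA frames without end_of_stream. Illustrative only. */
public class EmptyDataFrameGuard {
    private final int maxConsecutiveEmptyFrames;
    private int consecutiveEmpty;

    public EmptyDataFrameGuard(int maxConsecutiveEmptyFrames) {
        this.maxConsecutiveEmptyFrames = maxConsecutiveEmptyFrames;
    }

    /** Returns false once the connection should be closed. */
    public boolean onDataFrame(int payloadLength, boolean endOfStream) {
        if (payloadLength == 0 && !endOfStream) {
            // Assumption: the frame that exceeds the limit is the one that trips the guard.
            return ++consecutiveEmpty <= maxConsecutiveEmptyFrames;
        }
        consecutiveEmpty = 0; // any non-empty or stream-ending frame resets the count
        return true;
    }
}
```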

HTTP2: Add protection against remote control frames that are triggered by a remote peer (#9460)

Motivation:

Due to how the HTTP/2 spec is defined, a remote peer can flood us with frames that trigger control frames in response. The problem is that the remote peer can also simply stop reading these responses (while still producing more such frames) and so may drive us to the point where we either run out of memory or burn all CPU. To protect against this we need to implement some kind of limit that tears down connections that cause this situation.

See CVE-2019-9512 / CVE-2019-9514 / CVE-2019-9515

Modifications:

- Add Http2ControlFrameLimitEncoder which limits the number of queued control frames that were caused because of the remote peer.

- Allow inserting the Http2ControlFrameLimitEncoder by setting AbstractHttp2ConnectionBuilder.encoderEnforceMaxQueuedControlFrames(...) to a number higher than 0. The default is 10000, which provides some protection by default but will hopefully not cause too many false positives.

- Add unit tests

Result:

Protect against DDoS due to control frames. Fixes CVE-2019-9512 / CVE-2019-9514 / CVE-2019-9515.
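
The limiting idea is to count control frames that have been queued in response to the peer but not yet written, and to give up on the connection once that count passes a threshold. The following sketch uses invented names and plain method calls in place of write promises; it is not the Http2ControlFrameLimitEncoder.

```java
/** Tracks queued-but-unwritten control frames against a limit. Illustrative only. */
public class QueuedControlFrameLimiter {
    private final int maxQueuedControlFrames;
    private int outstanding;

    public QueuedControlFrameLimiter(int maxQueuedControlFrames) {
        this.maxQueuedControlFrames = maxQueuedControlFrames;
    }

    /** Called when a response control frame is queued; false means close the connection. */
    public boolean onControlFrameQueued() {
        return ++outstanding <= maxQueuedControlFrames;
    }

    /** Called when a queued control frame has actually been written to the peer. */
    public void onControlFrameWritten() {
        if (outstanding > 0) {
            outstanding--;
        }
    }

    public int outstanding() { return outstanding; }
}
```

A peer that keeps reading lets the counter fall back towards zero, so only a peer that triggers responses without draining them reaches the limit.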

Use alloc().heapBuffer(...) to allocate new heap buffer.

Motivation

Underlying array allocations in UnpooledHeapByteBuf are intended to be done
via the protected allocateArray(int) method, so that they can be tracked
and/or overridden by subclasses, for example
UnpooledByteBufAllocator$InstrumentedUnpooledHeapByteBuf or #8015. But
it looks like an explicit allocation was missed in the copy(int,int)
method.

Modification

Just use alloc().heapBuffer(...) for the allocation

Result

No possibility of "missing" array allocations when ByteBuf#copy is used.

Delay Http2ConnectionPrefaceAndSettingsFrameWrittenEvent by one EventLoop tick when using the Http2FrameCodec (#9442)

Motivation:

We should delay the firing of the Http2ConnectionPrefaceAndSettingsFrameWrittenEvent by one EventLoop tick when using the Http2FrameCodec to ensure all handlers are added to the pipeline before the event is passed through it.

This is needed to work around a race that could happen when the preface is sent in handlerAdded(...) but a later handler wants to act on the event.

Modifications:

Offload firing of the event to the EventExecutor.

Result:

Fixes https://github.com/netty/netty/issues/9432.
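
Deferring work by one tick of a single-threaded executor can be sketched with the JDK alone. In this illustration a single-thread ExecutorService stands in for the EventLoop and log entries stand in for pipeline activity; all names and messages are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Shows that a task re-submitted to a single-thread executor runs after the current one. */
public class DeferredEventSketch {
    public static List<String> run() {
        ExecutorService loop = Executors.newSingleThreadExecutor();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        try {
            loop.submit(() -> {
                log.add("first handler added, preface written");
                // Offload the event instead of firing it inline.
                loop.execute(() -> log.add("event fired"));
                // The rest of the setup still runs before the deferred event.
                log.add("later handler added");
            }).get(); // wait so the deferred task is queued before shutdown
            loop.shutdown();
            loop.awaitTermination(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            loop.shutdownNow();
        }
        return log;
    }
}
```

Because the executor has a single thread, the deferred task cannot start until the task that submitted it has returned, so the later handler is always in place before the event is recorded.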

Use delegated docker mount option to speedup builds (#9441)

Motivation:

As we use the docker files for the CI we should use the delegated mount option to speed up builds.

See https://docs.docker.com/docker-for-mac/osxfs-caching/#delegated

Modifications:

Use delegated mount option

Result:

Faster builds when using docker

Always wrap X509ExtendedTrustManager when using OpenSSL and JDK < 11 (#9443)

Motivation:

When OpenSSL is used with JDK < 11, we need to wrap the user-provided X509ExtendedTrustManager to be able to support TLS1.3. We had a check in place that first tried to see if wrapping is needed at all, which could lead to misleading calls of the user-provided trust manager. We should remove these calls and just always wrap if needed.

Modifications:

Always wrap if OpenSSL + JDK < 11 and TLS1.3 is supported

Result:

Fewer misleading calls to the user-provided trust manager

Try to load native linux libraries with matching classifier first (#9411)

Motivation:

Users' runtime systems may have dynamic libraries that are incompatible with
the ones our tcnative wrappers link to. Unfortunately, we cannot determine and
catch these scenarios (in which the JVM crashes) but we can make a more educated
guess on what library to load and try to find one that works better before crashing.

Modifications:

1) Build dynamically linked openSSL builds for more OSs (netty-tcnative)

2) Load native linux libraries with matching classifier (first)

Result:

More developers / users can use the dynamically-linked native libraries.

Do not cache local/remote address when creating EpollDatagramChannel with InternetProtocolFamily (#9436)

Motivation:

EpollDatagramChannel#localAddress returns wrong information when
EpollDatagramChannel is created with InternetProtocolFamily,
and EpollDatagramChannel#localAddress is invoked BEFORE the actual binding.

This is a regression caused by change
https://github.com/netty/netty/commit/e17ce934da501788c3737631a02d02ee89ebf9df

Modifications:

EpollDatagramChannel() and EpollDatagramChannel(InternetProtocolFamily family)
do not cache local/remote address

Result:

Rebinding on the same address without "reuse port" works

EpollDatagramChannel#localAddress returns correct address